Abstract

We consider optimization problems with polynomial inequality constraints in non-commuting
variables. These non-commuting variables are viewed as bounded operators on a Hilbert space
whose dimension is not fixed and the associated polynomial inequalities as semidefinite
positivity constraints. Such problems arise naturally in quantum theory and quantum
information science. To solve them, we introduce a hierarchy of semidefinite programming
relaxations which generates a monotone sequence of lower bounds that converges to the
optimal solution. We also introduce a criterion to detect whether the global optimum is
reached at a given relaxation step and show how to extract a global optimizer from the
solution of the corresponding semidefinite programming problem.



1 Introduction 



A standard problem in optimization theory is to find the global minimum of a polynomial 
on a set constrained by polynomial inequalities, that is, to solve the program 

$$ p^* = \min_{x \in \mathbb{R}^n} \; p(x) \quad \text{s.t.} \quad q_i(x) \geq 0, \quad i = 1, \ldots, m, \tag{1} $$

where $p(x)$ and $q_i(x)$ are real-valued polynomials in the variable $x \in \mathbb{R}^n$. To deal with such
non-convex problems, Lasserre [1] introduced a sequence of semidefinite programming (SDP)
relaxations of increasing size, whose optima converge monotonically to the global optimum
$p^*$; a similar approach has been proposed by Parrilo [2]. This paper presents a generalization
of Lasserre's method for a non-commutative version of the optimization problem (1). That is,
we consider a polynomial optimization problem where the variables are not
simply real numbers, but non-commuting (NC) variables for which, in general, $x_i x_j \neq x_j x_i$.
Our motivation comes from quantum theory, whose basic objects are matrices and operators 
that do not commute. But our approach might also find applications in other fields that 
involve optimization over matrices or operators, such as in systems engineering [3]. 

To write down the non-commutative version of (1), let $p(x)$ and $q_i(x)$ be polynomial
expressions in the non-commuting variables $x = (x_1, \ldots, x_n)$. Given a Hilbert space $H$ and
a set $X = (X_1, \ldots, X_n)$ of bounded operators on $H$, we define operators $p(X)$ and $q_i(X)$ by
substituting the operators $X$ for the variables $x$ in the expressions $p(x)$ and $q_i(x)$. Given in
addition a normalized vector $\phi$ in $H$, we evaluate the polynomial $p(X)$ as
$\langle \phi, p(X)\phi \rangle$. The non-commutative version of the optimization problem (1)
considered here is then

$$ p^* = \min_{(H, X, \phi)} \; \langle \phi, p(X)\phi \rangle \quad \text{s.t.} \quad q_i(X) \succeq 0, \quad i = 1, \ldots, m, \tag{2} $$

where $q_i(X) \succeq 0$ means that the operator $q_i(X)$ should be positive semidefinite. In other
words, given the input data $p(x)$ and $q_i(x)$, we look for the combination $(H, X, \phi)$ of Hilbert
space $H$, operators $X$, and normalized state $\phi$ (both defined on $H$) that minimizes
$\langle \phi, p(X)\phi \rangle$ subject to the constraints $q_i(X) \succeq 0$. It is important to note
that the dimension of the Hilbert space $H$ is not fixed, but subject to optimization as well.

Taking inspiration from Lasserre's method [1] and from the papers [4, 5], we introduce
a hierarchy of SDP relaxations for the above optimization problem. The optimal solutions
of these relaxations form a monotonically increasing sequence of lower bounds on the global
minimum $p^*$. We prove that this sequence converges to the optimum $p^*$ when the set of
constraints $q_i(X) \succeq 0$ is such that every tuple of operators $X = (X_1, \ldots, X_n)$
satisfying them is bounded, i.e., such that $C^2 - (X_1^* X_1 + \cdots + X_n^* X_n) \succeq 0$ for
some real constant $C > 0$. Our proof is constructive: from the sequence of optimal solutions of the
SDP relaxations, we build an explicit global minimizer $(H^*, X^*, \phi^*)$ for (2), where $H^*$ is,
in general, infinite-dimensional. In some cases, the SDP relaxation at a given finite step
in the hierarchy may already yield the global minimum $p^*$. We introduce a criterion to
detect such events, and show in this case how to extract the global minimizer $(H^*, X^*, \phi^*)$
from the solution of this particular SDP relaxation. The resulting Hilbert space $H^*$ is then
finite-dimensional, with its dimension determined by the rank of the matrices involved in
the solution of the SDP relaxation.

Our method can find direct applications in quantum information science, e.g., to compute
upper bounds on the maximal violation of Bell inequalities, and in quantum chemistry to
compute atomic and molecular ground state energies. Practice reveals that convergence is
usually fast and finite (up to machine precision).

1.1 Relation to other works 

Unconstrained NC polynomial optimization problems (i.e., the minimization of a single
polynomial $p(X)$ with no constraints of the form $q_i(X) \succeq 0$) were considered in [5]. Such
problems can also be solved using SDP, as implemented in the MATLAB toolbox NCSOStools [7].
Unlike constrained NC optimization (2), which requires a sequence of SDPs to compute the
minimum, for unconstrained NC optimization a single SDP is sufficient, by a theorem of
Helton stating that a symmetric NC polynomial is positive if and only if it admits a sum of
squares decomposition [8]. This single SDP actually corresponds to the first step of our hierarchy
when neglecting constraints coming from the conditions $q_i(X) \succeq 0$.

In [4], a subclass of the general constrained NC problem (2) which is of interest in quantum
information (described later in the text) was considered and a sequence of SDP relaxations
introduced for it. The convergence of this SDP sequence was established in [5, 9, 10]. Our
work can be seen as a generalization of these results to arbitrary NC polynomial optimization.

In the commutative case, the convergence of the relaxations introduced by Lasserre is
based on a sum of squares representation theorem of Putinar [11] for positive polynomials.
The connection to Putinar's representation arises when considering the dual problems of the
SDP relaxations. A non-commutative analogue of Putinar's result, the Positivstellensatz
for non-commutative positive polynomials, has been introduced by Helton and McCullough
[12]. Although we first prove the convergence of the hierarchy introduced here through the
primal version of our SDP relaxations (in the spirit of [5]), we also provide an alternative
proof through the duals, which exploits Helton and McCullough's result (as used in [9] and
[10]).

Note that the problem (2) can also account for equality constraints $q_i(X) = 0$, which can
be enforced through the inequalities $q_i(X) \succeq 0$ and $-q_i(X) \succeq 0$. When constraints of the
form $x_i x_j - x_j x_i = 0$ are explicitly added to (2), that is, when we require that the variables $x$
commute, our method reduces to the one introduced by Lasserre. It is interesting to note that
the results presented here, such as the convergence of the hierarchy or the criterion to detect
optimality, are easier to establish in the general non-commutative framework than they are
in the specialized commutative case. This commutative setting has generated quite a large
literature and the properties of the corresponding SDP relaxations have been thoroughly
investigated. We refer to [13] for a review. Our work provides an NC analogue only of the
most basic results in the commutative case. It would be interesting to reexamine other
topics in this subject from an NC perspective.

1.2 Organization of the paper 

In Section 2, we define some notation and introduce in more detail the class of problems 
that we consider here. Section 3 contains our main results: we introduce our hierarchy of 
SDP relaxations, prove its convergence, show how to detect optimality at a finite step in 
the hierarchy and how to extract a global optimizer. We then explain the relation between 
our approach and the works of Helton and McCullough. We proceed by mentioning briefly 
how to modify our method to deal efficiently with equality constraints. In particular, we 
discuss how it can be simplified when dealing with hermitian variables and how it reduces 
to Lasserre's method in the case of commuting variables. We end Section 3 by showing how 
our method can be extended to solve a slightly more general class of optimization problems. 
In Section 4, we illustrate our method on concrete examples. Finally, we briefly discuss 
practical applications of our method in the quantum setting in Section 5. 
2 Notation and definitions



Let $\mathbb{K} \in \{\mathbb{R}, \mathbb{C}\}$ denote the field of real or complex numbers. We consider the
algebra $\mathbb{K}[x, x^*]$ of polynomials in the $2n$ noncommuting variables $x = (x_1, \ldots, x_n)$ and
$x^* = (x_1^*, \ldots, x_n^*)$ with coefficients from $\mathbb{K}$. That is, an element $p \in \mathbb{K}[x, x^*]$ is a
linear combination

$$ p = \sum_w p_w w \tag{3} $$

of words $w$ in the $2n$ letters $x$ and $x^*$, where the sum is finite and $p_w \in \mathbb{K}$. We interpret * as
an involution (that is, loosely speaking, a conjugate transpose) defined as follows: on letters,
$(x_i)^* = x_i^*$ and $(x_i^*)^* = x_i$; on a word $w = w_1 \cdots w_k$, $w^* = w_k^* \cdots w_1^*$; and on a polynomial,
$p^* = \sum_w \bar{p}_w w^*$, where $\bar{p}_w$ is the complex conjugate of $p_w$. Thus $\mathbb{K}[x, x^*]$ is the free
*-algebra generated by the $2n$ variables $x_1, \ldots, x_n, x_1^*, \ldots, x_n^*$. In the following, we will often
view these $2n$ variables as $x_1, \ldots, x_n, x_{n+1}, \ldots, x_{2n}$ by identifying $x_{n+i}$ with $x_i^*$.

Throughout this paper, the symbols $u, v, w$ always denote words and we denote the
empty word by $1$. We use the notation $W_d$ for the set of all words of length $|w|$ at most
$d$, and $W_\infty$ for the set of all words (of unrestricted length). The number of words in $W_d$ is
$|W_d| = ((2n)^{d+1} - 1)/(2n - 1)$. The degree of a polynomial $p$ is the length of the longest word
in $p$ and is denoted $\deg(p)$. We let $\mathbb{K}[x, x^*]_d$ denote the set of polynomials
$p = \sum_{|w| \leq d} p_w w$ of degree $\leq d$. If necessary, a polynomial of degree $d$ can be viewed as a
polynomial of higher degree $d'$ by setting to zero the coefficients associated with words of
length larger than $d$. A polynomial $p$ is said to be hermitian if $p^* = p$, or in terms of its
coefficients, if $\bar{p}_w = p_{w^*}$. Note that words can be interpreted as monomials and we will
sometimes use the two terms interchangeably. We will then also refer to the length $|w|$ of a
word as the degree of the monomial $w$ and to $W_d$ as a monomial basis for $\mathbb{K}[x, x^*]_d$.
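
The word bookkeeping above can be sketched in a few lines of Python. The integer encoding of letters (0, ..., n-1 for $x_1, \ldots, x_n$ and n, ..., 2n-1 for their adjoints) and the helper names `words_up_to` and `star` are illustrative assumptions, not notation from the paper; the sketch only checks the counting formula for $|W_d|$ and the action of the involution on words.

```python
from itertools import product

def words_up_to(n, d):
    """All words of length <= d in the 2n letters 0..2n-1.

    Letters 0..n-1 stand for x_1..x_n and letters n..2n-1 for x_1^*..x_n^*
    (the identification x_{n+i} = x_i^*). The empty tuple () is the empty word.
    """
    letters = range(2 * n)
    return [w for length in range(d + 1) for w in product(letters, repeat=length)]

def star(word, n):
    """Involution on words: reverse the word and swap each letter with its adjoint."""
    return tuple((a + n) % (2 * n) for a in reversed(word))

n, d = 2, 3
W = words_up_to(n, d)
count_from_formula = ((2 * n) ** (d + 1) - 1) // (2 * n - 1)
print(len(W), count_from_formula)

# (x_1 x_2^*)^* = x_2 x_1^*: letters (0, 3) are mapped to (1, 2)
print(star((0, 3), n))
```

Representing words as tuples keeps concatenation (`v + w`) and the involution cheap, which is convenient when words are later used as indices of moment matrices.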

Let $B(H)$ denote the set of bounded operators on a Hilbert space $H$ defined on the
field $\mathbb{K}$. Consider a set of operators $X = (X_1, \ldots, X_n)$ from $B(H)$. Given the polynomial
$p \in \mathbb{K}[x, x^*]$, we define the operator $p(X) \in B(H)$ by substituting every variable $x_i$ by the
operator $X_i$ and every variable $x_i^*$ by $X_i^*$, where * denotes the adjoint operation on $H$. If
$p^* = p$ is a hermitian polynomial, then $p(X)^* = p^*(X) = p(X)$ is a hermitian operator and the
quantity $\langle \phi, p(X)\phi \rangle$ is real for every vector $\phi$ in $H$. A hermitian operator $O$ is said to be
positive semidefinite, a fact that we denote by $O \succeq 0$, if $\langle \phi, O\phi \rangle \geq 0$ for all $\phi \in H$.
2.1 Formulation of the optimization problem

Let $p$ and $q_i$ ($i = 1, \ldots, m$) be hermitian polynomials in $\mathbb{K}[x, x^*]$. We are interested in the
following optimization problem:

$$ P: \quad p^* = \min_{(H, X, \phi)} \; \langle \phi, p(X)\phi \rangle \quad \text{s.t.} \quad q_i(X) \succeq 0, \quad i = 1, \ldots, m, \tag{4} $$

where the optimization should be understood over all Hilbert spaces $H$ (of arbitrary
dimension), all sets of operators $X = (X_1, \ldots, X_n)$ in $B(H)$, and all normalized vectors $\phi$ in
$H$. We assume throughout the remainder of the paper that this problem admits a feasible
solution, that is, that there exists a triple $(H, X, \phi)$ such that $\langle \phi, \phi \rangle = 1$ and
$q_i(X) \succeq 0$ for $i = 1, \ldots, m$.

Let $Q = \{q_i : i = 1, \ldots, m\}$ be the set of polynomials determining the positivity
constraints in (4). The following definitions follow those used in [14]. The positivity domain $S_Q$
associated to $Q$ is the class of tuples $X = (X_1, \ldots, X_n)$ of bounded operators on a Hilbert
space making each $q_i(X)$ a positive semidefinite operator. The quadratic module $M_Q$ is the
set of all elements of the form $\sum_i f_i^* f_i + \sum_i \sum_j g_{ij}^* q_i g_{ij}$, where $f_i$ and $g_{ij}$ are polynomials
in $\mathbb{K}[x, x^*]$. We say that $M_Q$ is Archimedean if there exists a real constant $C$ such that
$C^2 - (x_1^* x_1 + \cdots + x_{2n}^* x_{2n}) \in M_Q$. In this case, the positivity domain $S_Q$ is bounded: for all
$X \in S_Q$, $C^2 - (X_1^* X_1 + \cdots + X_{2n}^* X_{2n}) \succeq 0$. Note that if $S_Q$ is bounded, we can always add
$C^2 - (x_1^* x_1 + \cdots + x_{2n}^* x_{2n})$ to $Q$ for a sufficiently large $C$ to make $M_Q$ Archimedean without
changing $S_Q$. In the following we will always assume that the constraints in $Q$ are such that
$M_Q$ is Archimedean.

3 Main results 

3.1 Moment and localizing matrices 

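Let $y = (y_w)_{|w| \leq d} \in \mathbb{K}^{|W_d|}$ be a sequence of real or complex numbers indexed in $W_d$, i.e.,
to each word $w \in W_d$ corresponds a number $y_w \in \mathbb{K}$. We define the linear mapping
$L_y : \mathbb{K}[x, x^*]_d \to \mathbb{K}$ as

$$ p \mapsto L_y(p) = \sum_{|w| \leq d} p_w y_w . \tag{5} $$
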
By analogy with [1], given a sequence $y = (y_w)_{|w| \leq 2k}$ indexed in $W_{2k}$, we define the moment
matrix $M_k(y)$ of order $k$ as a matrix with rows and columns indexed in $W_k$ and whose entry
$(v, w)$ is given by

$$ M_k(y)(v, w) = L_y(v^* w) = y_{v^* w} . \tag{6} $$

If $q = \sum_{|u| \leq d} q_u u$ is a polynomial of degree $d$ and $y = (y_w)_{|w| \leq 2k+d}$ a sequence indexed in
$W_{2k+d}$, we define the localizing matrix $M_k(qy)$ as the matrix with rows and columns indexed
in $W_k$, and whose entry $(v, w)$ is

$$ M_k(qy)(v, w) = L_y(v^* q w) = \sum_{|u| \leq d} q_u y_{v^* u w} . \tag{7} $$

We say that a sequence $y = (y_w)_{|w| \leq 2k}$ admits a moment representation if there exists a
triple $(H, X, \phi)$ with a normalized $\phi$ such that

$$ y_w = \langle \phi, w(X)\phi \rangle \tag{8} $$

for all $|w| \leq 2k$.
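
To make the index structure of definitions (6) and (7) concrete, the following sketch lists which variable $y_{v^* w}$ sits in each entry of a small moment matrix, and which combination $\sum_u q_u y_{v^* u w}$ sits in each entry of a localizing matrix. The integer encoding of letters, the helper names, and the example polynomial $q = 4 - x_1^* x_1$ are assumptions made for illustration only.

```python
from itertools import product

def words_up_to(n, d):
    """Words of length <= d; letters 0..n-1 are x_i, letters n..2n-1 are x_i^*."""
    return [w for length in range(d + 1) for w in product(range(2 * n), repeat=length)]

def star(word, n):
    """Involution on words: reverse and swap each letter with its adjoint."""
    return tuple((a + n) % (2 * n) for a in reversed(word))

def moment_matrix_words(n, k):
    """Entry (v, w) is the word v^* w, i.e. the index of the variable y_{v^* w}."""
    Wk = words_up_to(n, k)
    return [[star(v, n) + w for w in Wk] for v in Wk]

def localizing_matrix_terms(q, n, k):
    """Entry (v, w) is a dict {v^* u w: q_u} encoding sum_u q_u y_{v^* u w}.

    The polynomial q is given as a dict mapping words u to coefficients q_u.
    """
    Wk = words_up_to(n, k)
    rows = []
    for v in Wk:
        row = []
        for w in Wk:
            terms = {}
            for u, coeff in q.items():
                key = star(v, n) + u + w
                terms[key] = terms.get(key, 0) + coeff
            row.append(terms)
        rows.append(row)
    return rows

def show(word, n):
    """Human-readable form of a word, e.g. (1, 0) -> 'x1* x1' when n = 1."""
    if not word:
        return "1"
    return " ".join(f"x{a % n + 1}" + ("*" if a >= n else "") for a in word)

n = 1
M1 = moment_matrix_words(n, 1)
for row in M1:
    print([show(entry, n) for entry in row])

q = {(): 4.0, (1, 0): -1.0}          # hypothetical constraint q = 4 - x1^* x1
L0 = localizing_matrix_terms(q, n, 0)
print(L0)
```

In an SDP solver, each distinct word appearing in these tables would correspond to one scalar decision variable, so the tables describe the linear structure imposed on the matrices.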

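Lemma 1. Let $y = (y_w)_{|w| \leq 2k}$ be a sequence admitting a moment representation. Then
$y_1 = 1$ and $M_k(y) \succeq 0$. If the moment representation is such that $q(X) \succeq 0$ for some
$q \in \mathbb{K}[x, x^*]$, then in addition $M_{k-d}(qy) \succeq 0$, where $d = \lceil \deg(q)/2 \rceil$.

Proof. Eq. (8) immediately implies $y_1 = 1$ since $\langle \phi, \phi \rangle = 1$. The positivity of the moment
matrix $M_k(y)$ follows from the fact that for any vector $z \in \mathbb{K}^{|W_k|}$,

$$
\begin{aligned}
z^* M_k(y) z &= \sum_{v,w} \bar{z}_v M_k(y)(v, w) z_w = \sum_{v,w} \bar{z}_v y_{v^* w} z_w \\
&= \Big\langle \phi, \sum_v \bar{z}_v v^*(X) \sum_w z_w w(X) \phi \Big\rangle = \langle \phi, z^*(X) z(X) \phi \rangle \geq 0 ,
\end{aligned} \tag{9}
$$

where we have defined the operator $z(X) = \sum_w z_w w(X)$.

Suppose now that $y$ admits a moment representation (8) by a triple $(H, X, \phi)$ such that
$q(X) \succeq 0$. Then $M_{k-d}(qy) \succeq 0$ since for all vectors $z \in \mathbb{K}^{|W_{k-d}|}$,

$$
\begin{aligned}
z^* M_{k-d}(qy) z &= \sum_{v,w} \bar{z}_v M_{k-d}(qy)(v, w) z_w = \sum_{v,w,u} \bar{z}_v q_u y_{v^* u w} z_w \\
&= \Big\langle \phi, \sum_v \bar{z}_v v^*(X) \sum_u q_u u(X) \sum_w z_w w(X) \phi \Big\rangle \\
&= \langle \phi, z^*(X) q(X) z(X) \phi \rangle \geq 0 ,
\end{aligned} \tag{10}
$$

where the last inequality follows from the fact that $q(X) \succeq 0$. □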
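
As a numerical sanity check of Lemma 1, the sketch below builds the moments $y_w = \langle \phi, w(X)\phi \rangle$ from a hand-picked $2 \times 2$ operator and unit vector, and tests the quadratic forms of the moment matrix $M_2(y)$ and of the localizing matrix $M_1(qy)$ for $q = C^2 - x_1^* x_1$ on random vectors. The operator `X1`, the vector `phi`, the value $C^2 = 4$, and all helper names are assumptions chosen for illustration; this is not part of the paper's method, only an instance of Eqs. (9) and (10).

```python
import random
from itertools import product

def dagger(A):
    """Conjugate transpose of a matrix stored as a list of rows."""
    return [[A[j][i].conjugate() for j in range(len(A))] for i in range(len(A[0]))]

def star(word, n):
    """Involution on words: reverse and swap each letter with its adjoint."""
    return tuple((a + n) % (2 * n) for a in reversed(word))

def words_up_to(n, d):
    return [w for length in range(d + 1) for w in product(range(2 * n), repeat=length)]

# Hypothetical data: one operator X1 on C^2, a unit vector phi, and C^2 = 4.
# The squared Frobenius norm of X1 is 1.38 < 4, so C^2 - X1^* X1 is positive semidefinite.
n = 1
X1 = [[0.5 + 0j, 1j], [0.2 + 0j, -0.3 + 0j]]
ops = [X1, dagger(X1)]               # letter 0 -> X1, letter 1 -> X1^*
phi = [0.6 + 0j, 0.8j]
C2 = 4.0

def y(word):
    """Moment y_w = <phi, w(X) phi>, as in Eq. (8)."""
    vec = phi
    for a in reversed(word):         # the rightmost letter acts first
        vec = [sum(ops[a][i][j] * vec[j] for j in range(2)) for i in range(2)]
    return sum(phi[i].conjugate() * vec[i] for i in range(2))

k = 2
Wk = words_up_to(n, k)
M = [[y(star(v, n) + w) for w in Wk] for v in Wk]             # moment matrix M_k(y)

W1 = words_up_to(n, k - 1)
Mq = [[C2 * y(star(v, n) + w) - y(star(v, n) + (1, 0) + w)    # localizing matrix M_{k-1}(qy)
       for w in W1] for v in W1]

def quad_form(A, z):
    """z^* A z for a square matrix A and a complex vector z."""
    size = len(z)
    return sum(z[i].conjugate() * A[i][j] * z[j] for i in range(size) for j in range(size))

random.seed(0)
def random_vector(size):
    return [complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(size)]

checks = [quad_form(M, random_vector(len(Wk))).real for _ in range(20)]
checks_q = [quad_form(Mq, random_vector(len(W1))).real for _ in range(20)]
print(abs(y(()) - 1) < 1e-12, min(checks) > -1e-9, min(checks_q) > -1e-9)
```

Testing random quadratic forms is only a necessary check; a full positivity test would compute the smallest eigenvalue of each matrix with a numerical linear algebra library.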
3.2 Convergent SDP relaxations

For $2k \geq \max\{\deg(p), \max_i \deg(q_i)\}$, consider the SDP problem

$$
\begin{aligned}
R_k: \quad p^k = \min_{y} \quad & \sum_w p_w y_w \\
\text{s.t.} \quad & y_1 = 1, \\
& M_k(y) \succeq 0, \\
& M_{k-d_i}(q_i y) \succeq 0, \quad i = 1, \ldots, m,
\end{aligned} \tag{11}
$$

where $d_i = \lceil \deg(q_i)/2 \rceil$ and the optimization is over $y = (y_w)_{|w| \leq 2k} \in \mathbb{K}^{|W_{2k}|}$. The optimum $p^k$
provides a lower bound on the global optimum $p^*$ of the original problem $P$, since any feasible
solution $(H, X, \phi)$ of $P$ yields a feasible solution $y$ of $R_k$ through Eq. (8) and Lemma 1. We
refer to $R_k$ as the SDP relaxation of order $k$ of $P$. Since the positivity of the moment and
localizing matrices of a given order $k'$ implies the positivity of the moment and localizing
matrices of lower orders $k$, the sequence of SDP relaxations forms a hierarchy in the sense
that $p^k \leq p^{k'}$ when $k \leq k'$.

Theorem 1. If $M_Q$ is Archimedean, then $\lim_{k \to \infty} p^k = p^*$.

Recall that if $M_Q$ is Archimedean, there exist polynomials $f_i$ and $g_{ij}$ and a constant
$C$ such that $C^2 - (x_1^* x_1 + \cdots + x_{2n}^* x_{2n}) = \sum_i f_i^* f_i + \sum_i \sum_j g_{ij}^* q_i g_{ij}$. In the following, we
write $d_M = \max_{i,j}\{\deg(f_i), \deg(g_{ij}) + d_i\}$. Note that $d_M \geq 1$, with $d_M = 1$ when
$C^2 - (x_1^* x_1 + \cdots + x_{2n}^* x_{2n})$ is contained in $Q$. Although the asymptotic behavior of the hierarchy
of SDP relaxations only depends on the quadratic module being Archimedean, it may be
advantageous in practice to add the constraint $C^2 - (x_1^* x_1 + \cdots + x_{2n}^* x_{2n})$ to $Q$. This will
guarantee in particular that the first step of the hierarchy has a bounded solution (see
Lemma 3).

The proof of Theorem 1 is based on the following four lemmas. 

Lemma 2. Let $c = C^2 - (x_1^* x_1 + \cdots + x_{2n}^* x_{2n})$ and let $y$ be a sequence satisfying $y_1 = 1$,
$M_k(y) \succeq 0$, and $M_{k-1}(cy) \succeq 0$. Then $|y_w| \leq C^{|w|}$ for all $|w| \leq 2k$.

Proof. The diagonal elements of $M_{k-1}(cy)$ are of the form
$C^2 y_{w^* w} - \sum_{i=1}^{2n} y_{w^* x_i^* x_i w}$ with $|w| \leq k - 1$. Since the localizing matrix $M_{k-1}(cy)$ is
positive semidefinite, these diagonal entries must be nonnegative, that is,
$\sum_{i=1}^{2n} y_{w^* x_i^* x_i w} \leq C^2 y_{w^* w}$. In addition, it also holds that $y_{w^* x_i^* x_i w} \geq 0$ since these
numbers are diagonal entries of the moment matrix $M_k(y)$. It thus follows that
$y_{w^* x_i^* x_i w} \leq C^2 y_{w^* w}$ for all $|w| \leq k - 1$ and all $i = 1, \ldots, 2n$. Given that $y_1 = 1$,
we deduce by induction that $y_{w^* w} \leq C^{2|w|}$ for all $|w| \leq k$.

The moment matrix $M_k(y)$ admits the following matrix

$$ \begin{pmatrix} y_{w^* w} & y_{w^* v} \\ y_{v^* w} & y_{v^* v} \end{pmatrix} $$

as a submatrix, where $|w|, |v| \leq k$. Since $M_k(y) \succeq 0$, the above submatrix must also
be positive semidefinite, which implies that $y_{w^* v} \, y_{v^* w} \leq y_{w^* w} \, y_{v^* v}$.
Combining this relation with the previous bound on $y_{w^* w}$ and the fact that $y_{v^* w} = \bar{y}_{w^* v}$,
which follows from the hermiticity of $M_k(y)$, we obtain $|y_{w^* v}| \leq C^{|w| + |v|}$. Since every word of
length at most $2k$ can be written as $w^* v$ with $|w|, |v| \leq k$, we deduce that $|y_w| \leq C^{|w|}$ for
all $|w| \leq 2k$. □

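Lemma 3. Let $2k \geq \max\{\deg(p), \max_i \deg(q_i)\}$ and let $M_Q$ be Archimedean. Let $y$ be a
feasible solution of the relaxation $R_{k-1+d_M}$. Then $|y_w| \leq C^{|w|}$ for all $|w| \leq 2k$.

Proof. First note that if $f \in \mathbb{K}[x, x^*]_d$ is a polynomial of degree $d$ and $y$ a sequence such
that $M_{k+d}(y) \succeq 0$, then $M_k(f^* f y) \succeq 0$. This follows from the fact that $M_{k+d}(y) \succeq 0$
implies

$$ \sum_{|v|,|w| \leq k} \; \sum_{|\tilde{v}|,|\tilde{w}| \leq d} \bar{z}_v \bar{f}_{\tilde{v}} \, L_y(v^* \tilde{v}^* \tilde{w} w) \, f_{\tilde{w}} z_w \geq 0 $$

for all $z \in \mathbb{K}^{|W_k|}$, and from the identity

$$ \sum_{|v|,|w| \leq k} \; \sum_{|\tilde{v}|,|\tilde{w}| \leq d} \bar{z}_v \bar{f}_{\tilde{v}} \, L_y(v^* \tilde{v}^* \tilde{w} w) \, f_{\tilde{w}} z_w = \sum_{|v|,|w| \leq k} \bar{z}_v L_y(v^* f^* f w) z_w = z^* M_k(f^* f y) z . $$

Similarly, if $g \in \mathbb{K}[x, x^*]_d$ is a polynomial of degree $d$ and $y$ a sequence such that
$M_{k+d}(q_i y) \succeq 0$, then $M_k(g^* q_i g y) \succeq 0$. Indeed, from $M_{k+d}(q_i y) \succeq 0$ we deduce that for
all $z \in \mathbb{K}^{|W_k|}$,

$$ \sum_{|v|,|w| \leq k} \; \sum_{|\tilde{v}|,|\tilde{w}| \leq d} \bar{z}_v \bar{g}_{\tilde{v}} \, L_y(v^* \tilde{v}^* q_i \tilde{w} w) \, g_{\tilde{w}} z_w \geq 0 , $$

and the left-hand side of this last inequality is equal to
$\sum_{|v|,|w| \leq k} \bar{z}_v L_y(v^* g^* q_i g w) z_w = z^* M_k(g^* q_i g y) z$.

Now, let $y$ be a feasible solution of the relaxation $R_{k-1+d_M}$ as in the statement of
the lemma and let $c = C^2 - (x_1^* x_1 + \cdots + x_{2n}^* x_{2n})$. As $M_Q$ is Archimedean, we can write
$c = \sum_i f_i^* f_i + \sum_{i,j} g_{ij}^* q_i g_{ij}$ and thus
$M_{k-1}(cy) = \sum_i M_{k-1}(f_i^* f_i y) + \sum_{i,j} M_{k-1}(g_{ij}^* q_i g_{ij} y)$.
Since $M_{k-1+d_M}(y) \succeq 0$ and $M_{k-1+d_M-d_i}(q_i y) \succeq 0$, the argument outlined above implies
that $M_{k-1}(f_i^* f_i y) \succeq 0$ and $M_{k-1}(g_{ij}^* q_i g_{ij} y) \succeq 0$. This in turn implies $M_{k-1}(cy) \succeq 0$. From
Lemma 2 we then deduce that $|y_w| \leq C^{|w|}$ for all $|w| \leq 2k$. □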

Lemma 4. If $M_Q$ is Archimedean, the optima $p^k$ of the relaxations $R_k$ form, for $k$ large
enough, a monotonically increasing bounded sequence. Therefore, the limit
$\hat{p} = \lim_{k \to \infty} p^k$ exists.

Proof. Let $l = l' - 1 + d_M$ with $2l' \geq \max\{\deg(p), \max_i \deg(q_i)\}$, and let $y$ be the solution of
the relaxation $R_l$ with objective value $p^l$. From Lemma 3, the entries $y_w$ with $|w| \leq 2l'$ are
bounded, i.e., $|y_w| \leq C^{|w|}$. Thus the optimum $p^l = \sum_{|w| \leq 2l'} p_w y_w$ is bounded as well. We also
have that $p^*$ is bounded since for $M_Q$ Archimedean, the positivity domain $S_Q$ is bounded.
For all $k \geq l$, $p^l \leq p^k \leq p^{k+1} \leq p^*$. Thus the $(p^k)_{k \geq l}$ form a monotonically increasing bounded
sequence and the limit $\hat{p} = \lim_{k \to \infty} p^k$ exists. □

Lemma 5. Let $M_Q$ be Archimedean and let $\hat{p} = \lim_{k \to \infty} p^k$ be the limit of the optimal
solutions $p^k$ of the relaxations $R_k$. Then there exists an infinite sequence $y = (y_w)_{|w| = 0, 1, \ldots}$
indexed in $W_\infty$ such that $|y_w| \leq C^{|w|}$,

$$ \sum_w p_w y_w = \hat{p}, \qquad y_1 = 1, \tag{13} $$

and

$$ M_k(y) \succeq 0, \qquad M_{k-d_i}(q_i y) \succeq 0, \quad i = 1, \ldots, m, \tag{14} $$

for all $k$ large enough.

Proof. For any $k$ such that $2k \geq \max\{\deg(p), \max_i \deg(q_i)\}$, let $y^{k-1+d_M}$ be a feasible solution of
the relaxation $R_{k-1+d_M}$ with objective value $\hat{p}$. Such a solution always exists because the
problem $R_{k-1+d_M}$ is convex and there exist feasible points of $R_{k-1+d_M}$ with objective values $p_1$
and $p_2$ satisfying $p_1 \leq \hat{p} \leq p_2$ (take for instance $p_1 = p^{k-1+d_M}$ and $p_2 = p^*$). By Lemma 3, the
entries $y_w^{k-1+d_M}$ with $|w| \leq 2k$ are bounded, i.e., $|y_w^{k-1+d_M}| \leq C^{|w|}$. Let $y^k$ be the restriction
of the solution $y^{k-1+d_M}$ to the words with $|w| \leq 2k$. That is, $y^k = (y_w^{k-1+d_M})_{|w| \leq 2k}$ is the
subsequence of $y^{k-1+d_M}$ composed of the entries $y_w^{k-1+d_M}$ with $|w| \leq 2k$. Complete $y^k$ with zeros
to make it an infinite vector indexed in $W_\infty$ and perform the renormalization
$y_w^k \to z_w^k = y_w^k / C^{|w|}$. Each vector $z^k$ thus belongs to the unit ball of $\ell_\infty$, and the sequence
$(z^k)$ admits by the Banach-Alaoglu theorem a subsequence $(z^{k_i})_{i = 1, 2, \ldots}$ that converges in the
weak-* topology to a limit $\lim_{i \to \infty} z^{k_i} = z$ [15]. This implies in particular pointwise
convergence, i.e., $\lim_{i \to \infty} z_w^{k_i} = z_w$ for all $w$. Define the infinite vector $y$ through
$y_w = z_w C^{|w|}$. The pointwise convergence $z^{k_i} \to z$ implies the pointwise convergence
$y^{k_i} \to y$, i.e., $\lim_{i \to \infty} y_w^{k_i} = y_w$ for all $w$. Since $\sum_w p_w y_w^{k'} = \hat{p}$, $y_1^{k'} = 1$,
$M_k(y^{k'}) \succeq 0$, and $M_{k-d_i}(q_i y^{k'}) \succeq 0$ ($i = 1, \ldots, m$) for all $k, k'$ with
$k' \geq k$, we deduce Eqs. (13) and (14) from the pointwise convergence $y^{k_i} \to y$. □
Proof of Theorem 1. By Lemma 4 the limit $\hat{p} = \lim_{k \to \infty} p^k$ exists. We obviously have that
$\hat{p} \leq p^*$. We now show that there exist a set of operators $X$ and a vector $\phi$ in a Hilbert space
$H$ (possibly of infinite dimension) that yield a feasible solution of $P$ with objective value $\hat{p}$.
Thus, we also have that $\hat{p} \geq p^*$, and therefore $\hat{p} = p^*$. Incidentally, this implies that the
minimum appearing in equation (4) is well defined, i.e., it is not an infimum, as one might
have expected in general.

To build $(H, X, \phi)$, we perform a Gelfand-Naimark-Segal-like construction. Let $y$ be the
infinite sequence defined in Lemma 5. Consider the linear functional $L_y : \mathbb{K}[x, x^*] \to \mathbb{K}$,
$p \mapsto L_y(p) = \sum_w p_w y_w$. Since $M_k(y) \succeq 0$ for all $k$, this linear functional is positive in the
sense that $L_y(p^* p) = \sum_{v,w} \bar{p}_v L_y(v^* w) p_w \geq 0$ for all $p$. It thus defines a semi-inner product
on $\mathbb{K}[x, x^*]$ through

$$ \langle p, q \rangle = L_y(p^* q) . \tag{15} $$

Define the set

$$ I = \{ p \in \mathbb{K}[x, x^*] : \langle p, p \rangle = 0 \} . \tag{16} $$

By the Cauchy-Schwarz inequality (which is valid for semi-inner products), the set $I$ is a
linear subspace of $\mathbb{K}[x, x^*]$. Moreover, it is a left ideal of $\mathbb{K}[x, x^*]$. To show that $I$ is a left
ideal of $\mathbb{K}[x, x^*]$, it is sufficient to show that $x_i I \subseteq I$ for all $i = 1, \ldots, 2n$. Since $M_Q$ is
Archimedean, there is some $C$ such that $c = C^2 - \sum_i x_i^* x_i \in M_Q$ and, as in the proof of
Lemma 3, one can show that $M_k(cy) \succeq 0$, from which it follows that

$$ 0 \leq L_y(p^* c p) = C^2 L_y(p^* p) - \sum_i L_y(p^* x_i^* x_i p) . \tag{17} $$

Since $L_y(p^* x_i^* x_i p) \geq 0$ for all $i$, (17) implies that

$$ 0 \leq L_y(p^* x_i^* x_i p) \leq C^2 L_y(p^* p) . \tag{18} $$

For all $p \in I$, we thus have that $L_y(p^* x_i^* x_i p) = 0$, that is, $x_i p \in I$.

The definition (15) of $\langle \cdot, \cdot \rangle$ induces a well-defined inner product on the quotient
$\mathbb{K}[x, x^*]/I$. Let $H$ denote the Hilbert space corresponding to the completion of $\mathbb{K}[x, x^*]/I$
with respect to this scalar product. We will now construct operators $X$ on $H$. For every $x_i$,
let $X_i$ be the operator of left multiplication by $x_i$ on $\mathbb{K}[x, x^*]/I$, i.e.,

$$ X_i (p + I) = x_i p + I . \tag{19} $$

Since $I$ is a left ideal, this map is well defined for every $x_i$. It is linear, and by (18) it is
bounded. Thus it extends uniquely to a bounded operator on $H$, which we denote by the
same symbol $X_i$. Note that the map is also consistent with the involution on $\mathbb{K}[x, x^*]$, i.e.,
it satisfies $X_i^* = X_{i+n}$, since
$\langle p, X_i^* q \rangle = \langle X_i p, q \rangle = \langle x_i p, q \rangle = L_y(p^* x_i^* q) = L_y(p^* x_{i+n} q) = \langle p, X_{i+n} q \rangle$.

Let $\phi$ be the vector of $\mathbb{K}[x, x^*]/I$ corresponding to the identity polynomial $1$. The fact
that $y_1 = 1$ implies that the vector $\phi$ is normalized: $\langle \phi, \phi \rangle = 1$. From (13), it follows that

$$ \langle \phi, p(X)\phi \rangle = \sum_w p_w \langle \phi, w(X)\phi \rangle = \sum_w p_w \langle 1, w \rangle = \sum_w p_w y_w = \hat{p} . \tag{20} $$

To show that $(H, X, \phi)$ yields a feasible solution to $P$ with objective value $\hat{p}$, it remains to
show that the operators $X$ satisfy $q_i(X) \succeq 0$ ($i = 1, \ldots, m$), i.e., that $\langle r, q_i(X) r \rangle \geq 0$ for all
$r \in H$. But since any $r \in H$ can be approximated to arbitrary precision by elements of the
pre-Hilbert space $\mathbb{K}[x, x^*]/I$, it is sufficient to show that $\langle p, q_i(X) p \rangle \geq 0$ for all $p \in \mathbb{K}[x, x^*]$.
This follows from

$$ \langle p, q_i(X) p \rangle = \langle p, q_i p \rangle = L_y(p^* q_i p) = \sum_{v,w} \bar{p}_v L_y(v^* q_i w) p_w \geq 0 , \tag{21} $$

since $M_{k-d_i}(q_i y) \succeq 0$ for all $k$. □

3.3 Optimality detection and extraction of optimizers 

In this subsection, we introduce a criterion that allows one to detect whether the relaxation of
order $k$ already yields the optimal value $p^*$. If so, it is possible to extract a global optimizer
$(H^*, X^*, \phi^*)$ from the optimal solution of this relaxation. The procedure to extract this
optimizer is described in the proof of the following theorem.

Theorem 2. Assume that the optimal solution $y^k$ of the relaxation of order $k$ satisfies

$$ \operatorname{rank} M_k(y^k) = \operatorname{rank} M_{k-d}(y^k) , \tag{22} $$

where $d = \max_i d_i \geq 1$. Then $p^k = p^*$, i.e., the optimum of the relaxation of order $k$ is
the global optimum of the original problem (4). Moreover, there exists a global optimizer
$(H^*, X^*, \phi^*)$ of (4) with $\dim H^* = \operatorname{rank} M_{k-d}(y^k)$.

Proof. We show that when (22) holds we can find a solution $(H, X, \phi)$ to (4) with objective
value $p^k$. This implies that $p^k \geq p^*$, and thus $p^k = p^*$ since we also have $p^k \leq p^*$.

Let $r = \operatorname{rank} M_k(y^k) = \operatorname{rank} M_{k-d}(y^k)$. Since the moment matrix $M_k(y^k)$ is positive
semidefinite, it admits a Gram decomposition. That is, to each row (and column), indexed by
a word $w$ with $|w| \leq k$, can be associated a vector $\vec{w} \in \mathbb{K}^r$ such that
$M_k(y^k)(w, v) = y^k_{w^* v} = \langle \vec{w}, \vec{v} \rangle$.
We define the Hilbert space $H$ as $H = \operatorname{span}\{\vec{w} : |w| \leq k\}$, with dimension $\dim H = r$. Note
that (22) implies that

$$ H = \operatorname{span}\{\vec{w} : |w| \leq k\} = \operatorname{span}\{\vec{w} : |w| \leq k - d\} . \tag{23} $$

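We now define $2n$ linear operators $X_i$ through their actions on the vectors $\vec{w}$ with $|w| \leq k - 1$ in
the following way:

$$ X_i \vec{w} = \overrightarrow{x_i w} . \tag{24} $$

Note that when $d \geq 1$, the operators are well defined on the whole space $H$ since by (23)
the set of vectors $\vec{w}$ with $|w| \leq k - d \leq k - 1$ spans $H$. This definition is also consistent
in the sense that if $f \in H$ admits two different decompositions $f = \sum_w a_w \vec{w} = \sum_w b_w \vec{w}$ as a
linear combination of the vectors $\{\vec{w} : |w| \leq k - 1\}$, then
$\sum_w a_w \overrightarrow{x_i w} = \sum_w b_w \overrightarrow{x_i w}$. Indeed the following equality

$$
\begin{aligned}
\Big\langle \vec{v}, \sum_w (a_w - b_w) \overrightarrow{x_i w} \Big\rangle &= \sum_w (a_w - b_w) y^k_{v^* x_i w} = \sum_w (a_w - b_w) y^k_{(x_i^* v)^* w} \\
&= \Big\langle \overrightarrow{x_i^* v}, \sum_w (a_w - b_w) \vec{w} \Big\rangle = \langle \overrightarrow{x_i^* v}, 0 \rangle = 0
\end{aligned} \tag{25}
$$

holds for all $v$ with $|v| \leq k - d \leq k - 1$. Since these vectors span the Hilbert space $H$, this
implies that both vectors $\sum_w a_w \overrightarrow{x_i w}$ and $\sum_w b_w \overrightarrow{x_i w}$ are identical. The definition (24) is also
consistent with the involution on $\mathbb{K}[x, x^*]$, i.e., it satisfies $X_i^* = X_{i+n}$. Indeed, for all $v, w$ of
length $|v|, |w| \leq k - 1$,

$$ \langle \vec{v}, X_i^* \vec{w} \rangle = \langle X_i \vec{v}, \vec{w} \rangle = \langle \overrightarrow{x_i v}, \vec{w} \rangle = y^k_{v^* x_i^* w} = y^k_{v^* x_{i+n} w} = \langle \vec{v}, X_{i+n} \vec{w} \rangle . \tag{26} $$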

Let us now define $\phi = \vec{1}$. Let $w$ be of length $|w| \leq 2k$ and write $w = w_1 w_2$ with
$|w_1|, |w_2| \leq k$. Then $\langle \phi, w(X)\phi \rangle = \langle \overrightarrow{w_1^*}, \vec{w}_2 \rangle = y^k_{w_1 w_2} = y^k_w$. This implies that
$\langle \phi, p(X)\phi \rangle = \sum_{|w| \leq 2k} p_w \langle \phi, w(X)\phi \rangle = \sum_{|w| \leq 2k} p_w y^k_w = p^k$. It remains to check that the
operators $X$ satisfy $q_i(X) \succeq 0$. To verify this it is only necessary, because of (23), to show that
the matrix $A$ with entries $A(v, w) = \langle \vec{v}, q_i(X) \vec{w} \rangle$ with $|v|, |w| \leq k - d$ is a positive semidefinite
matrix. This is the case, since $A$ is equal to $M_{k-d}(q_i y^k)$, and is thus a submatrix of
$M_{k-d_i}(q_i y^k)$, which is itself positive semidefinite because $y^k$ is a solution of the relaxation of
order $k$. □
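
The rank condition (22) is straightforward to test numerically once a moment sequence is available. The sketch below is a toy illustration under stated assumptions: the moments are generated from a hand-picked two-dimensional representation (a nilpotent $2 \times 2$ matrix and a basis vector) rather than from an SDP solver, $d = 1$ is assumed, and the rank is computed by a simple Gaussian elimination with complete pivoting written for this example. All names are hypothetical.

```python
from itertools import product

def star(word, n):
    """Involution on words: reverse and swap each letter with its adjoint."""
    return tuple((a + n) % (2 * n) for a in reversed(word))

def words_up_to(n, d):
    return [w for length in range(d + 1) for w in product(range(2 * n), repeat=length)]

# Hypothetical representation on a 2-dimensional space.
n = 1
X1 = [[0, 1], [0, 0]]
X1_dag = [[0, 0], [1, 0]]
ops = [X1, X1_dag]                   # letter 0 -> X1, letter 1 -> X1^*
phi = [1, 0]

def y(word):
    """Moment y_w = <phi, w(X) phi>."""
    vec = phi
    for a in reversed(word):         # the rightmost letter acts first
        vec = [sum(ops[a][i][j] * vec[j] for j in range(2)) for i in range(2)]
    return sum(phi[i].conjugate() * vec[i] for i in range(2))

def moment_matrix(k):
    Wk = words_up_to(n, k)
    return [[y(star(v, n) + w) for w in Wk] for v in Wk]

def numerical_rank(M, tol=1e-9):
    """Rank via Gaussian elimination with complete pivoting (tolerance relative to max entry)."""
    A = [list(row) for row in M]
    rows, cols = len(A), len(A[0])
    scale = max(abs(x) for row in A for x in row) or 1.0
    rank = 0
    for _ in range(min(rows, cols)):
        piv, pi, pj = max((abs(A[i][j]), i, j)
                          for i in range(rank, rows) for j in range(rank, cols))
        if piv <= tol * scale:
            break
        A[rank], A[pi] = A[pi], A[rank]              # move the pivot row up
        for row in A:                                # move the pivot column left
            row[rank], row[pj] = row[pj], row[rank]
        for i in range(rank + 1, rows):
            factor = A[i][rank] / A[rank][rank]
            for j in range(rank, cols):
                A[i][j] -= factor * A[rank][j]
        rank += 1
    return rank

k, d = 2, 1
rank_k = numerical_rank(moment_matrix(k))
rank_k_minus_d = numerical_rank(moment_matrix(k - d))
print(rank_k, rank_k_minus_d, rank_k == rank_k_minus_d)
```

For moments returned by a numerical SDP solver, the ranks are only defined up to a tolerance, so in practice one would inspect the singular values of both matrices rather than rely on a fixed threshold.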

Note that there exists a related optimality detection criterion in the commutative case,
which is based on the flat extension theorem of Curto and Fialkow [16, 13]. The matrix
$M_k(y^k)$ is said to be a flat extension of $M_{k-d}(y^k)$ if $\operatorname{rank} M_k(y^k) = \operatorname{rank} M_{k-d}(y^k)$. When
this condition holds, the flat extension theorem allows one (in the commutative case) to extend
the finite sequence $y^k$ to an infinite sequence $y$ satisfying $\operatorname{rank} M_{k'}(y) = \operatorname{rank} M_k(y^k)$ for all
$k' \geq k$. The proof of Theorem 2 yields an NC analogue of this important result (simply
define the infinite sequence $y$ through $y_w = \langle \phi, w(X)\phi \rangle$, where $\phi$ and $X$ are the vector and
operators defined in the proof of Theorem 2).

3.4 Relation to the Positivstellensatz for non-commutative polynomials

We now explain the link between the convergence of the SDP relaxations and the
Positivstellensatz for non-commutative polynomials introduced by Helton and McCullough [12].
We proceed by analogy with the link that exists in the commutative case between the
convergence of Lasserre's relaxations [1] and Putinar's Positivstellensatz [11].
Consider the problem

$$
\begin{aligned}
\lambda^k = \max \quad & \lambda \\
\text{s.t.} \quad & p - \lambda = \sum_j b_j^* b_j + \sum_{i=1}^m \sum_j c_{ij}^* q_i c_{ij} , \\
& \max_j \deg(b_j) \leq k , \\
& \max_j \deg(c_{ij}) \leq k - d_i ,
\end{aligned} \tag{27}
$$

where $b_j$ and $c_{ij}$ are polynomials. The expression $\sum_j b_j^* b_j$ is known as a sum of squares
(SOS) and the above problem is a polynomial SOS problem. As shown in Appendix B, this
polynomial SOS problem can be formulated as an SDP problem, which turns out to be the
dual of $R_k$. This implies that the optimal solution of (27) provides a lower bound on the
solution of $R_k$, i.e.,

$$ \lambda^k \leq p^k . \tag{28} $$

Alternatively, this last relation can be established as follows. Let $\lambda, b_j, c_{ij}$ be a feasible
solution of (27) and $y$ be a feasible solution of (11). We show that
$L_y(p - \lambda) = \sum_w p_w y_w - \lambda \geq 0$, which implies (28). As
$L_y(p - \lambda) = \sum_j L_y(b_j^* b_j) + \sum_i \sum_j L_y(c_{ij}^* q_i c_{ij})$, it is sufficient to
show that $L_y(b_j^* b_j) \geq 0$ and that $L_y(c_{ij}^* q_i c_{ij}) \geq 0$. Writing $b_j = \sum_w b_{j,w} w$, we find

$$
\begin{aligned}
L_y(b_j^* b_j) &= \sum_{v,w} \bar{b}_{j,v} L_y(v^* w) b_{j,w} \\
&= \sum_{v,w} \bar{b}_{j,v} M_k(y)(v, w) b_{j,w} \geq 0 ,
\end{aligned} \tag{29}
$$

where we have used the definition (6) of the moment matrix $M_k(y)$ in the second equality
and the property that $M_k(y) \succeq 0$ to deduce the last inequality. Similarly,

$$
\begin{aligned}
L_y(c_{ij}^* q_i c_{ij}) &= \sum_{v,w} \bar{c}_{ij,v} \sum_u q_{i,u} L_y(v^* u w) c_{ij,w} \\
&= \sum_{v,w} \bar{c}_{ij,v} M_{k-d_i}(q_i y)(v, w) c_{ij,w} \geq 0 ,
\end{aligned} \tag{30}
$$

where we have used the definition (7) of the localizing matrix $M_{k-d_i}(q_i y)$ and the property
$M_{k-d_i}(q_i y) \succeq 0$.

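So far, we thus have that $\lambda^k \leq p^k \leq p^*$ for all $k$. We note now from the definition (4)
that for any $\epsilon > 0$, the polynomial $p - (p^* - \epsilon)$ is strictly positive on $S_Q$. It then follows
from the Positivstellensatz representation theorem of Helton and McCullough [12] (see
footnote 2) that

$$ p - (p^* - \epsilon) = \sum_j b_j^* b_j + \sum_{i=1}^m \sum_j c_{ij}^* q_i c_{ij} \tag{31} $$

for some polynomials $b_j$ and $c_{ij}$. Let $k \geq \max_{i,j}\{\deg(b_j), \deg(c_{ij}) + d_i\}$. Then
$(\lambda = p^* - \epsilon, b_j, c_{ij})$ is a feasible solution of (27) with objective value $p^* - \epsilon$ and therefore
$\lambda^k \geq p^* - \epsilon$. It follows that $p^* - \epsilon \leq \lambda^k \leq p^k \leq p^*$, which implies $p^k \to p^*$ since
$\epsilon > 0$ is arbitrary.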

We have thus shown that the convergence of the relaxations $R_k$ can be proved,
alternatively to the proof given in Subsection 3.2, using the Positivstellensatz for
non-commutative polynomials. In fact, both proofs are essentially equivalent and the proof
presented in Subsection 3.2 can itself be viewed as an indirect proof of the Positivstellensatz
for non-commutative polynomials. The advantage of the proof given in Subsection 3.2 is
that it is more constructive in spirit and it inspired the proof of Theorem 2, where a procedure
is given to build an optimizer $(H^*, X^*, \phi^*)$. The proof that we have just given, on the other
hand, connects with the fascinating theory of positive polynomials. We see for instance that
an a priori bound on the maximal degree $k$ necessary in the SOS decomposition (31) would
yield information on the speed of convergence of the relaxations $R_k$.

Footnote 2: The proof given by Helton and McCullough only covers the case of polynomials with
real coefficients, but it is straightforward to generalize it to the complex case; see for
instance [9].

3.5 Dealing with equality constraints

The problem $P$ can contain a set of equality constraints $e_i(X) = 0$ ($i = 1, \ldots, m_e$), which
can be enforced through the pairs of inequalities $e_i(X) \succeq 0$ and $-e_i(X) \succeq 0$. Rather than
writing down directly the corresponding relaxations $R_k$, it can be advantageous to exploit
these equalities to reduce the complexity of the problem.
The set of equalities

$$E = \{e_i : i = 1, \ldots, m_e\} \subset K[x, x^*] \qquad (32)$$

generates the ideal

$$I = \Big\{ \sum_i f_i e_i g_i : f_i, g_i \in K[x, x^*] \Big\}, \qquad (33)$$

which is such that any $p \in I$ satisfies $p(X) = 0$ for operators $X$ such that $e_i(X) = 0$
($i = 1, \ldots, m_e$). It is therefore sufficient to express every polynomial $p \in K[x, x^*]$ modulo $I$,
that is, to work in the quotient ring $K[x, x^*]/I$. Let $B$ denote a monomial basis for $K[x, x^*]/I$.
Then we only need to consider polynomial expressions of the form $q = \sum_{w \in B} q_w w$, since for
every polynomial $p \in K[x, x^*]$ there exists a unique $q = \sum_{w \in B} q_w w$ such that $p - q \in I$. It is
readily seen that all the results presented so far still hold when we work at relaxation step $k$
with the reduced monomial basis $B_k = B \cap W_k$. The relaxation $R_k$ then corresponds to an
optimization over the set of variables $(y_w)_{w \in B_{2k}}$ and involves matrices $M_k(y)$ and $M_{k-d_i}(q_i y)$
of sizes $|B_k| \times |B_k|$ and $|B_{k-d_i}| \times |B_{k-d_i}|$, respectively. This represents a reduction in the
complexity of the original problem.
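
The reduction to a monomial basis can be illustrated by a minimal Python sketch. It assumes a single relation, $x_1^2 = x_1$ (the one used in the first example of Section 4), for which the normal form of a word is obtained by collapsing repeated occurrences of the letter 1. The function names `reduce_word` and `reduced_basis` are hypothetical and serve only this illustration.

```python
from itertools import product


def reduce_word(word):
    """Normal form of a word modulo x1*x1 = x1: collapse runs of the letter 1."""
    out = []
    for letter in word:
        if letter == 1 and out and out[-1] == 1:
            continue  # x1 x1 -> x1
        out.append(letter)
    return tuple(out)


def reduced_basis(n_letters, k):
    """Reduced words of length at most k, i.e. B_k = B ∩ W_k for this relation."""
    basis = []
    for length in range(k + 1):
        for word in product(range(1, n_letters + 1), repeat=length):
            normal = reduce_word(word)
            if normal not in basis:
                basis.append(normal)
    return basis


basis_2 = reduced_basis(2, 2)  # words 1, x1, x2, x1x2, x2x1, x2^2
basis_3 = reduced_basis(2, 3)
```

For two letters and $k = 2$ the sketch returns six words instead of the seven words of the unreduced basis, in agreement with the basis $W_2$ used for the first example of Section 4.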

The whole difficulty, of course, lies in building a monomial basis $B$ for the quotient ring
$K[x, x^*]/I$. This can be done, e.g., if a finite Gröbner basis exists and can be computed
efficiently for the ideal $I$ [17]. Below we give two examples where such a reduced
monomial basis $B$ is readily obtained.

3.5.1 Hermitian variables 

Polynomials in hermitian variables are elements of the free $*$-algebra $K[x]$ with generators
$x = (x_1, \ldots, x_n)$ and anti-involution $*$ defined on letters as $x_i^* = x_i$. Our previous results
carry over to this situation if words are now viewed as built on the $n$ letters $x_1, \ldots, x_n$ rather
than the $2n$ letters $x_1, \ldots, x_n, x_{n+1}, \ldots, x_{2n}$ and if the anti-involution $*$ is re-interpreted
accordingly. Since the algebra is now based on $n$ generators, the set $W_d$ of words of length at most $d$ has
$|W_d| = (n^{d+1} - 1)/(n - 1)$ elements, compared to $((2n)^{d+1} - 1)/(2n - 1)$ for the general case
in $2n$ variables. The size of the optimization variables $y$ and the dimension of the moment
and localizing matrices in the SDP problem $R_k$ are reduced accordingly.
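
The size reduction can be checked numerically with a short Python sketch that enumerates words explicitly and compares the count with the closed form above. The helper names `count_words` and `closed_form` are hypothetical, and the closed form is only used for at least two letters.

```python
from itertools import product


def count_words(n_letters, d):
    """Number of words of length at most d on n_letters letters, by enumeration."""
    return sum(1 for length in range(d + 1)
               for _ in product(range(n_letters), repeat=length))


def closed_form(n_letters, d):
    """Geometric-sum formula (n^(d+1) - 1)/(n - 1), valid for n_letters >= 2."""
    return (n_letters ** (d + 1) - 1) // (n_letters - 1)


n, d = 2, 2
hermitian_size = count_words(n, d)      # n letters x_1, ..., x_n
general_size = count_words(2 * n, d)    # 2n letters x_1, ..., x_2n
```

For $n = 2$ and $d = 2$ the moment matrix is indexed by 7 words in the hermitian case, against 21 in the general case.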
3.5.2 Commuting variables and link with Lasserre's results

The method that we have presented to solve optimization problems in non-commuting variables
also contains, as a particular case, the commutative version (1) considered by Lasserre,
since constraints of the type $X_iX_j - X_jX_i = 0$ can explicitly be imposed on the operators
$X_i$. More precisely, the problem

$$\begin{aligned}
p_c = \min_{(H,X,\phi)} \;& \langle \phi, p(X)\phi \rangle \\
\text{s.t.} \;& q_i(X) \succeq 0, \quad i = 1, \ldots, m \\
& X_iX_j - X_jX_i = 0, \quad i, j = 1, \ldots, n,
\end{aligned} \qquad (34)$$

where the variables $X_i$ are assumed to be hermitian and all polynomials are expressed in
terms of real coefficients, is identical to (1). To show that (34) and (1) are equivalent, note
that the operators $X$ in any feasible solution $(H, X, \phi)$ of (34) generate an abelian algebra.
Hence the Hilbert space $H$ (or at least the part of $H$ on which the operators $X$ and the
state $\phi$ have support) is isomorphic to a direct integral $\int^{\oplus} H_x \, d\mu(x)$ of one-dimensional
Hilbert spaces $H_x$, and the operators $X_i$ are decomposable as $X_i = \int^{\oplus} x_i \, d\mu(x)$, where each
$x_i$ is a scalar operator that acts only on $H_x$ [18]. A priori, any point $x \in \mathbb{R}^n$ defines a
possible $n$-tuple of operators $(x_1, \ldots, x_n)$ and can be associated with a factor $H_x$, but to
satisfy (34) the measure $d\mu(x)$ should be such that $\int_S d\mu(x) = 1$ and $\int_{\mathbb{R}^n \setminus S} d\mu(x) = 0$, where
$S = \{x \in \mathbb{R}^n : q_i(x) \geq 0, \; i = 1, \ldots, m\}$. Thus (34) is equivalent to

$$\begin{aligned}
p_c = \min_{\mu} \;& \int p(x) \, d\mu(x) \\
\text{s.t.} \;& \int_S d\mu(x) = 1, \quad \int_{\mathbb{R}^n \setminus S} d\mu(x) = 0,
\end{aligned} \qquad (35)$$

where the minimum is taken over all measures $\mu$ on $\mathbb{R}^n$. As shown by Lasserre [1], the
problems (35) and (1) are equivalent. Indeed, as $p(x) \geq p^*$ on $S$, $\int p \, d\mu \geq p^*$ and thus
$p_c \geq p^*$. On the other hand, if $x^*$ is a global minimizer of (1), then the measure $\mu^* = \delta_{x^*}$ is
admissible for (35), and thus $p_c \leq p^*$.

The relaxations $R_k$ are constructed on the canonical basis of non-commutative monomials,
for instance for $n = 2$, $W_2 = \{1, x_1, x_2, x_1^2, x_1x_2, x_2x_1, x_2^2\}$. Simplifying these relaxations
using the constraints $X_iX_j - X_jX_i = 0$ amounts to considering only the canonical basis of
commutative monomials, e.g., $W_2 = \{1, x_1, x_2, x_1^2, x_1x_2, x_2^2\}$, which leads to exactly the same
construction as the one introduced by Lasserre. In particular, the criterion for detecting global
optimality presented in Subsection 3.3 coincides with the detection criterion introduced in
the commutative situation [19]. If we apply the procedure outlined in the proof of Theorem 2
to extract optimal solutions from the solution of a finite order relaxation $R_k$, we end up with
a set of operators $X = (X_1, \ldots, X_n)$ which are matrices each of dimension $r = \operatorname{rank} M_k(y^k)$.
As these matrices all commute, they can be simultaneously diagonalized, with each set of
common eigenvalues $(x_1(j), \ldots, x_n(j))$ ($j = 1, \ldots, r$) corresponding to one optimal solution
of (1). We thus see that while the rank of the moment matrix $r = \operatorname{rank} M_k(y^k)$ is related to
the Hilbert space dimension of the global optimal solution in the non-commutative case, it
is related to the number of global solutions extracted by the algorithm in the commutative
case.

It is interesting to note that most of our results, such as the convergence of the hierarchy
or the criterion to detect optimality, are easier to establish in the general non-commutative
framework than they are in the specialized commutative case. Note also that it may be easier,
from a computational point of view, to solve the non-commutative version of a problem than
it is to solve the commutative one. In particular, the speed of convergence of the SDP
relaxations may be faster in the non-commutative case than in the commutative one. This
is dramatically illustrated by the following example.

Let $p$ be a polynomial of degree 2 and consider the quadratic problem

$$\begin{aligned}
p^* = \min_{(H,X,\phi)} \;& \langle \phi, p(X)\phi \rangle \\
\text{s.t.} \;& X_i^2 - X_i = 0, \quad i = 1, \ldots, n,
\end{aligned} \qquad (36)$$

where the variables $X_i$ are assumed to be hermitian. Its first order relaxation is

$$\begin{aligned}
p^1 = \min_{y} \;& \sum_w p_w y_w \\
\text{s.t.} \;& y_1 = 1 \\
& M_1(y) \succeq 0 \\
& y_{ii} - y_i = 0, \quad i = 1, \ldots, n.
\end{aligned} \qquad (37)$$

Any feasible point $y$ of the above SDP problem with objective value $p(y) = \sum_w p_w y_w$ defines
a feasible point of (36) with objective value $\langle \phi, p(X)\phi \rangle = p(y)$, and therefore $p^1 = p^*$, i.e.,
the first order relaxation already yields the global optimum of the original problem. To
see this, perform a Gram decomposition of the matrix $M_1(y)$: $M_1(y)(v, w) = y_{vw} = \langle \mathbf{v}, \mathbf{w} \rangle$,
where $|v|, |w| \leq 1$, i.e., $v, w \in \{1, x_1, \ldots, x_n\}$. Define the vector $\phi = \mathbf{1}$, which is normalized
since $\langle \mathbf{1}, \mathbf{1} \rangle = y_1 = 1$, and the operators $X_i$ ($i = 1, \ldots, n$) as the projectors on $\mathbf{x}_i$. Obviously,
$X_i^2 = X_i$. Moreover, $X_i \phi = X_i \mathbf{x}_i + X_i(\phi - \mathbf{x}_i) = \mathbf{x}_i$, where the last equality follows from
the fact that $\langle \mathbf{x}_i, \phi - \mathbf{x}_i \rangle = y_i - y_{ii} = 0$. This implies that $y_{vw} = \langle \phi, v(X)w(X)\phi \rangle$ for
$v, w \in \{1, x_1, \ldots, x_n\}$ and therefore that $p(y) = \langle \phi, p(X)\phi \rangle$ since $p$ is of degree 2. Using
similar arguments, one can actually show that the minimization of a polynomial of arbitrary
degree evaluated over projection operators can always be determined from the first relaxation
of the problem.
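
The construction above can be traced numerically. The following Python sketch is only an illustration under stated assumptions: the moment values `y1`, `y2`, `y12` are chosen by hand (they are not taken from the paper) so that $M_1(y)$ is positive definite with $y_{ii} = y_i$, and the Gram vectors are obtained from a small hand-written Cholesky factorization. All function names are hypothetical.

```python
import math


def cholesky(m):
    """Lower-triangular factor low with m = low low^T (m positive definite)."""
    n = len(m)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(m[i][i] - s)
            else:
                low[i][j] = (m[i][j] - s) / low[j][j]
    return low


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def project(onto, u):
    """Apply the rank-one projector onto span(onto) to the vector u."""
    scale = dot(onto, u) / dot(onto, onto)
    return [scale * a for a in onto]


# Hand-picked feasible point of (37) for n = 2 (illustrative values only).
y1, y2, y12 = 0.5, 0.5, 0.1
M1 = [[1.0, y1, y2],
      [y1, y1, y12],   # y_11 = y_1
      [y2, y12, y2]]   # y_22 = y_2

gram = cholesky(M1)    # row i is the Gram vector of the i-th word
phi, x1, x2 = gram     # Gram vectors of the words 1, x_1, x_2

X1_phi = project(x1, phi)          # X_1 phi, expected to equal the vector x1
X2_phi = project(x2, phi)          # X_2 phi, expected to equal the vector x2
moment_1 = dot(phi, X1_phi)        # <phi, X_1 phi>
moment_12 = dot(X1_phi, X2_phi)    # <phi, X_1 X_2 phi>, using X_1 hermitian
```

The reconstructed moments `moment_1` and `moment_12` reproduce `y1` and `y12`, as the argument in the text predicts.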

The commutative version of (36) is the quadratically constrained quadratic program

$$\begin{aligned}
p^* = \min_{x \in \mathbb{R}^n} \;& p(x) \\
\text{s.t.} \;& x_i^2 - x_i = 0, \quad i = 1, \ldots, n.
\end{aligned} \qquad (38)$$

Since 0-1 integer programming can be formulated in this form, it is NP-hard to solve a
general instance of (38). Thus, contrary to the non-commutative case, it is highly unlikely
that considering relaxations up to some bounded order might be sufficient to solve this
problem.



3.6 Generalization 



In this subsection, we introduce a slight generalization of the problem P to which our
method readily extends. We state the results without entering into the details of the proofs.

In addition to the polynomials $p$ and $\{q_i : i = 1, \ldots, m_q\}$ defined in P, consider the sets
of polynomials $\{r_i : i = 1, \ldots, m_r\}$ and $\{s_i : i = 1, \ldots, m_s\}$, where the $s_i$'s are hermitian.
The problem that we consider is

$$\mathbf{P}: \quad \begin{aligned}
p^* = \min_{(H,X,\phi)} \;& \langle \phi, p(X)\phi \rangle \\
\text{s.t.} \;& q_i(X) \succeq 0, \quad i = 1, \ldots, m_q \\
& r_i(X)\phi = 0, \quad i = 1, \ldots, m_r \\
& \langle \phi, s_i(X)\phi \rangle \geq 0, \quad i = 1, \ldots, m_s.
\end{aligned} \qquad (39)$$

We thus not only require that the operators $X$ satisfy $q_i(X) \succeq 0$, but we also require that
$r_i(X)$ acting on $\phi$ yield the null vector and that the average value of $s_i(X)$ be positive. As
before we assume that $Q = \{q_i : i = 1, \ldots, m_q\}$ is such that the quadratic module $M_Q$ is
Archimedean.

For $r \in K[x, x^*]_d$ and $y = (y_w)_{|w| \leq k+d}$, a sequence indexed in $W_{k+d}$, define the vector
$m_k(ry)$ as the vector with components indexed in $W_k$ and whose component $w$ is equal to

$$m_k(ry)(w) = L_y(wr) = \sum_{|v| \leq d} r_v y_{wv}. \qquad (40)$$
If $y$ admits a moment representation such that $r(X)\phi = 0$, then $m_k(ry) = 0$, since

$$m_k(ry)(w) = \sum_v r_v y_{wv} = \sum_v r_v \langle \phi, w(X)v(X)\phi \rangle = \langle \phi, w(X)r(X)\phi \rangle = 0. \qquad (41)$$

If in addition $y$ admits a moment representation such that $\langle \phi, s(X)\phi \rangle \geq 0$, then obviously
$\sum_w s_w y_w \geq 0$. These observations motivate the following definition.

For $2k \geq \max_i \{\deg(p), \deg(q_i), \deg(r_i), \deg(s_i)\}$, we define the relaxation of order $k$
associated to the problem P as the SDP problem

$$R_k: \quad \begin{aligned}
p^k = \min_{y} \;& \sum_w p_w y_w \\
\text{s.t.} \;& y_1 = 1 \\
& M_k(y) \succeq 0 \\
& M_{k-d_i}(q_i y) \succeq 0, \quad i = 1, \ldots, m_q \\
& m_{2k-d'_i}(r_i y) = 0, \quad i = 1, \ldots, m_r \\
& \sum_w s_{i,w} y_w \geq 0, \quad i = 1, \ldots, m_s,
\end{aligned} \qquad (42)$$

where $d_i = \lceil \deg(q_i)/2 \rceil$, $d'_i = \deg(r_i)$, and the optimization is over $y \in K^{|W_{2k}|}$. It is easily
verified that $p^k \leq p^{k'}$ when $k \leq k'$, and that $p^k \leq p^*$ for all $k$.

The results obtained in Subsections 3.2 and 3.3 can easily be adapted to the above situation.

Theorem 3. If $M_Q$ is Archimedean, $\lim_{k \to \infty} p^k = p^*$.

Theorem 4. Assume that the optimal solution $y^k$ of the relaxation $R_k$ of order $k$ satisfies

$$\operatorname{rank} M_k(y^k) = \operatorname{rank} M_{k-d}(y^k), \qquad (43)$$

where $d = \max_i d_i \geq 1$, and

$$d'_i - d \leq k \qquad (44)$$

for all $i = 1, \ldots, m_r$. Then $p^k = p^*$, i.e., the optimum of the relaxation of order $k$ is the global
optimum of the original problem P. Moreover, there exists a global optimizer $(H^\star, X^\star, \phi^\star)$
of P with $\dim H^\star = \operatorname{rank} M_{k-d}(y^k)$.

The proofs of both these theorems follow along the same lines as the proofs of Theorem 1
and Theorem 2, respectively. One simply has to show that the reconstructed operators $X$
and the state $\phi$ satisfy the additional properties $r_i(X)\phi = 0$ and $\langle \phi, s_i(X)\phi \rangle \geq 0$. This can
be established given the conditions $m_{2k-d'_i}(r_i y) = 0$ and $\sum_w s_{i,w} y_w \geq 0$ present in $R_k$. The
additional constraint (44) with respect to Theorem 2 comes from the fact that to show that
$r_i(X)\phi = 0$, we need to show, because of (23), that $\langle \mathbf{w}, r_i(X)\phi \rangle = 0$ for all $|w| \leq k - d$. This
is implied by $m_{2k-d'_i}(r_i y) = 0$ when $2k - d'_i \geq k - d$, i.e., when (44) is satisfied.

The duals of the relaxations $R_k$ can be shown to be equivalent to the problems

$$\begin{aligned}
\lambda^k = \max \;& \lambda \\
\text{s.t.} \;& p - \lambda = \sum_j b_j^* b_j + \sum_{i=1}^{m_q} \sum_j c_{ij}^* q_i c_{ij} + \sum_{i=1}^{m_r} \left( f_i r_i + r_i^* f_i^* \right) + \sum_{i=1}^{m_s} g_i s_i \\
& \max_j \deg(b_j) \leq k, \\
& \max_j \deg(c_{ij}) \leq k - d_i, \\
& \deg(f_i) \leq 2k - d'_i, \\
& g_i \geq 0,
\end{aligned} \qquad (45)$$

where $b_j$, $c_{ij}$, $f_i$ are polynomials and $g_i$ are real numbers. From the decomposition of $p - \lambda^k$
appearing in (45), it clearly follows that $\langle \phi, (p(X) - \lambda^k)\phi \rangle \geq 0$ for any $(H, X, \phi)$ satisfying the
constraints in P. Thus the solution of the dual (45) provides a certificate that the optimal
solution $p^*$ of P cannot be lower than $\lambda^k$.

Finally, we mention that it is possible, taking inspiration from [20], to generalize the
problem P and the results associated to it to the case of matrix-valued polynomials, that
is, polynomials $\sum_w p_w w$, where each coefficient $p_w$ is now an $a \times b$ matrix with entries from
$K$. A Positivstellensatz also exists in this case [12].

4 Illustration of the method 

For the sake of illustration, we now apply our approach to simple examples. To simplify
the notation, throughout this section we label monomials (i.e., words) by the indices of the
ordered non-commutative variables of which they are composed. For instance, the word
$w = x_2x_1x_2x_2$ will be referred to as 2122. The empty word 1 corresponding to the identity
element of the algebra will be labeled by the symbol 0.

Our first example involves two hermitian variables $X_1 = X_1^*$ and $X_2 = X_2^*$ and has the
form

$$\begin{aligned}
p^* = \min_{(H,X,\phi)} \;& \langle \phi, (X_1X_2 + X_2X_1)\phi \rangle \\
\text{s.t.} \;& X_1^2 - X_1 = 0 \\
& -X_2^2 + X_2 + 1/2 \succeq 0.
\end{aligned} \qquad (46)$$

Since all constraint and objective polynomials are of degree at most 2, the first order relaxation
is associated with the monomial basis $W_2 = \{1, x_1, x_2, x_1x_2, x_2x_1, x_2^2\}$, where, following the
approach of Subsection 3.5, we used that $x_1^2 = x_1$. The first relaxation step thus involves
the relaxed variables $\{y_0, y_1, y_2, y_{12}, y_{21}, y_{22}\}$ and corresponds to the SDP problem



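$$\begin{aligned}
p^1 = \min_{y} \;& y_{12} + y_{21} \\
\text{s.t.} \;& \begin{pmatrix} 1 & y_1 & y_2 \\ y_1 & y_1 & y_{12} \\ y_2 & y_{21} & y_{22} \end{pmatrix} \succeq 0 \\
& -y_{22} + y_2 + 1/2 \geq 0.
\end{aligned} \qquad (47)$$

We solved this SDP problem using the Matlab toolboxes YALMIP [21] and SeDuMi [22].
After rounding, we obtain the solution $p^1 = -3/4$, achieved for the moment matrix

$$M_1 = \begin{pmatrix} 1 & 3/4 & -1/4 \\ 3/4 & 3/4 & -3/8 \\ -1/4 & -3/8 & 1/4 \end{pmatrix} \qquad (48)$$

with eigenvalues $0$ and $1 \pm \sqrt{37}/8$. The second order relaxation is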



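$$\begin{aligned}
p^2 = \min_{y} \;& y_{12} + y_{21} \\
\text{s.t.} \;& \begin{pmatrix}
1 & y_1 & y_2 & y_{12} & y_{21} & y_{22} \\
y_1 & y_1 & y_{12} & y_{12} & y_{121} & y_{122} \\
y_2 & y_{21} & y_{22} & y_{212} & y_{221} & y_{222} \\
y_{21} & y_{21} & y_{212} & y_{212} & y_{2121} & y_{2122} \\
y_{12} & y_{121} & y_{122} & y_{1212} & y_{1221} & y_{1222} \\
y_{22} & y_{221} & y_{222} & y_{2212} & y_{2221} & y_{2222}
\end{pmatrix} \succeq 0 \\
& \begin{pmatrix}
-y_{22} + y_2 + \tfrac{1}{2} & -y_{221} + y_{21} + \tfrac{1}{2} y_1 & -y_{222} + y_{22} + \tfrac{1}{2} y_2 \\
-y_{221} + y_{21} + \tfrac{1}{2} y_1 & -y_{1221} + y_{121} + \tfrac{1}{2} y_1 & -y_{1222} + y_{122} + \tfrac{1}{2} y_{12} \\
-y_{222} + y_{22} + \tfrac{1}{2} y_2 & -y_{1222} + y_{122} + \tfrac{1}{2} y_{12} & -y_{2222} + y_{222} + \tfrac{1}{2} y_{22}
\end{pmatrix} \succeq 0
\end{aligned} \qquad (49)$$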
with solution $p^2 = -3/4$. The moment matrix associated to this solution is

$$M_2 = \begin{pmatrix}
1 & 3/4 & -1/4 & -3/8 & -3/8 & 1/4 \\
3/4 & 3/4 & -3/8 & -3/8 & -3/16 & 0 \\
-1/4 & -3/8 & 1/4 & 3/16 & 0 & 1/8 \\
-3/8 & -3/8 & 3/16 & 3/16 & 3/32 & 0 \\
-3/8 & -3/16 & 0 & 3/32 & 3/16 & -3/16 \\
1/4 & 0 & 1/8 & 0 & -3/16 & 1/4
\end{pmatrix} \qquad (50)$$

which has two non-zero eigenvalues $\tfrac{3}{32}(14 \pm \sqrt{61})$.



Optimality criterion and extraction of optimizers. Since the matrix $M_2$ has two
non-zero eigenvalues, it has rank 2. Let $M_1(y^2)$ be the upper-left $3 \times 3$ submatrix of
$M_2 = M_2(y^2)$. This submatrix is, in fact, equal to (48) and thus also has rank 2. The
matrices $M_1(y^2)$ and $M_2(y^2)$ thus have the same rank and the condition (22) of Theorem 2
is satisfied. It follows that $p^* = p^2 = -3/4$. It also follows that we can extract a global
optimizer for (46), which will be realized in a space of dimension 2. For this, write down the
Gram decomposition $M_2 = R^T R$, where

$$R = \begin{pmatrix}
1 & 3/4 & -1/4 & -3/8 & -3/8 & 1/4 \\
0 & \sqrt{3}/4 & -\sqrt{3}/4 & -\sqrt{3}/8 & \sqrt{3}/8 & -\sqrt{3}/4
\end{pmatrix}. \qquad (51)$$



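Following the procedure specified in the proof of Theorem 2, we find the optimal solutions

$$X_1^\star = \begin{pmatrix} 3/4 & \sqrt{3}/4 \\ \sqrt{3}/4 & 1/4 \end{pmatrix}, \quad
X_2^\star = \begin{pmatrix} -1/4 & -\sqrt{3}/4 \\ -\sqrt{3}/4 & 5/4 \end{pmatrix}, \quad
\phi^\star = \begin{pmatrix} 1 \\ 0 \end{pmatrix}. \qquad (52)$$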
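
The optimizer can be checked directly. The Python sketch below assumes that (52) reads $X_1^\star = \begin{pmatrix} 3/4 & \sqrt{3}/4 \\ \sqrt{3}/4 & 1/4 \end{pmatrix}$, $X_2^\star = \begin{pmatrix} -1/4 & -\sqrt{3}/4 \\ -\sqrt{3}/4 & 5/4 \end{pmatrix}$, $\phi^\star = (1, 0)^T$, which is the reading consistent with the Gram matrix $R$ of (51); the helper functions are hypothetical and written for $2 \times 2$ matrices only.

```python
import math

s3 = math.sqrt(3.0)
X1 = [[0.75, s3 / 4], [s3 / 4, 0.25]]
X2 = [[-0.25, -s3 / 4], [-s3 / 4, 1.25]]
phi = [1.0, 0.0]


def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def expectation(op):
    """<phi, op phi> for a 2x2 matrix op."""
    image = [sum(op[i][k] * phi[k] for k in range(2)) for i in range(2)]
    return sum(phi[i] * image[i] for i in range(2))


# Constraint X1^2 - X1 = 0: largest entry of the residual.
X1_sq = matmul(X1, X1)
projector_residual = max(abs(X1_sq[i][j] - X1[i][j])
                         for i in range(2) for j in range(2))

# Constraint -X2^2 + X2 + 1/2 >= 0: here the matrix vanishes identically.
X2_sq = matmul(X2, X2)
Q = [[-X2_sq[i][j] + X2[i][j] + (0.5 if i == j else 0.0) for j in range(2)]
     for i in range(2)]
constraint_residual = max(abs(Q[i][j]) for i in range(2) for j in range(2))

# Objective <phi, (X1 X2 + X2 X1) phi>.
objective = expectation(matmul(X1, X2)) + expectation(matmul(X2, X1))
```

Under this reading, both constraints of (46) hold (the second one is saturated) and the objective equals $-3/4$.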



Dual. Solving the dual of the order 1 relaxation (47) yields, in the notation of Appendix B,
the solutions

$$\lambda = -3/4, \qquad
V = \begin{pmatrix} 1/4 & -1/2 & -1/2 \\ -1/2 & 1 & 1 \\ -1/2 & 1 & 1 \end{pmatrix}, \qquad
W = 1. \qquad (53)$$

The matrix $V$ has only one non-zero eigenvalue and can be written as $V = aa^T$ where
$a = [-1/2, 1, 1]$. In the formulation of (27), this corresponds to an SOS decomposition for
$x_1x_2 + x_2x_1$ of the form

$$x_1x_2 + x_2x_1 = -\frac{3}{4} + \left( -\frac{1}{2} + x_1 + x_2 \right)^2 + \left( -x_2^2 + x_2 + \frac{1}{2} \right). \qquad (54)$$

It immediately follows that $\langle \phi, (X_1X_2 + X_2X_1)\phi \rangle \geq -3/4$ for every $(H, X, \phi)$ satisfying
$X_1^2 = X_1$ and $-X_2^2 + X_2 + \tfrac{1}{2} \succeq 0$. Thus the decomposition (54) provides a certificate that the
solution (52) is optimal.
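
The SOS identity can be verified symbolically with a few lines of Python. The sketch below assumes that (54) reads $x_1x_2 + x_2x_1 = -3/4 + (-1/2 + x_1 + x_2)^2 + (-x_2^2 + x_2 + 1/2)$ modulo $x_1^2 = x_1$. Non-commutative polynomials are stored as dictionaries mapping words (tuples of letter indices) to rational coefficients; this representation and the helper names are choices made for this illustration only.

```python
from fractions import Fraction


def reduce_word(word):
    """Normal form modulo x1*x1 = x1: collapse runs of the letter 1."""
    out = []
    for letter in word:
        if letter == 1 and out and out[-1] == 1:
            continue
        out.append(letter)
    return tuple(out)


def add(p, q, scale=1):
    """Return p + scale * q, dropping zero coefficients."""
    result = dict(p)
    for word, coeff in q.items():
        result[word] = result.get(word, 0) + scale * coeff
    return {w: c for w, c in result.items() if c != 0}


def mul(p, q):
    """Non-commutative product: concatenate words, then reduce."""
    result = {}
    for u, a in p.items():
        for v, b in q.items():
            w = reduce_word(u + v)
            result[w] = result.get(w, 0) + a * b
    return {w: c for w, c in result.items() if c != 0}


one = {(): Fraction(1)}
x1 = {(1,): Fraction(1)}
x2 = {(2,): Fraction(1)}

b = add(add(x1, x2), one, Fraction(-1, 2))              # -1/2 + x1 + x2
q = add(add(x2, mul(x2, x2), -1), one, Fraction(1, 2))  # -x2^2 + x2 + 1/2

lhs = add(mul(x1, x2), mul(x2, x1))                     # x1 x2 + x2 x1
rhs = add(add(mul(b, b), q), one, Fraction(-3, 4))      # -3/4 + b^2 + q
```

Since $b$ is hermitian, $b^* b = b^2$, and the two sides agree word by word.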

Comparison with the commutative case. To illustrate the differences and similarities
between the non-commutative and commutative cases, let

$$\begin{aligned}
p^* = \min \;& 2x_1x_2 \\
\text{s.t.} \;& x_1^2 - x_1 = 0 \\
& -x_2^2 + x_2 + 1/2 \geq 0
\end{aligned} \qquad (55)$$

be the commutative version of (46). The first relaxation step associated to this problem
involves the monomial basis $W_2 = \{1, x_1, x_2, x_1x_2, x_2^2\}$ (we used $x_1x_2 = x_2x_1$) and
the corresponding relaxation variables $\{y_0, y_1, y_2, y_{12}, y_{22}\}$. This should be compared to
$W_2 = \{1, x_1, x_2, x_1x_2, x_2x_1, x_2^2\}$ and $\{y_0, y_1, y_2, y_{12}, y_{21}, y_{22}\}$ in the non-commutative case. The
first order relaxation associated to (55) is thus

$$\begin{aligned}
\min_{y} \;& 2y_{12} \\
\text{s.t.} \;& \begin{pmatrix} 1 & y_1 & y_2 \\ y_1 & y_1 & y_{12} \\ y_2 & y_{12} & y_{22} \end{pmatrix} \succeq 0 \\
& -y_{22} + y_2 + 1/2 \geq 0.
\end{aligned} \qquad (56)$$



Note that (47) and (56) are in fact identical, because the hermiticity of the moment matrix
in (47) implies that $y_{12} = y_{21}$. In general, it always happens that the first order relaxations
of the commutative and non-commutative versions of a problem coincide. We thus find as
before that $p^1 = -3/4$. The relaxation of order two of (56), however, is

$$\begin{aligned}
\min_{y} \;& 2y_{12} \\
\text{s.t.} \;& \begin{pmatrix}
1 & y_1 & y_2 & y_{12} & y_{22} \\
y_1 & y_1 & y_{12} & y_{12} & y_{122} \\
y_2 & y_{12} & y_{22} & y_{122} & y_{222} \\
y_{12} & y_{12} & y_{122} & y_{122} & y_{1222} \\
y_{22} & y_{122} & y_{222} & y_{1222} & y_{2222}
\end{pmatrix} \succeq 0 \\
& \begin{pmatrix}
-y_{22} + y_2 + \tfrac{1}{2} & -y_{122} + y_{12} + \tfrac{1}{2} y_1 & -y_{222} + y_{22} + \tfrac{1}{2} y_2 \\
-y_{122} + y_{12} + \tfrac{1}{2} y_1 & -y_{122} + y_{12} + \tfrac{1}{2} y_1 & -y_{1222} + y_{122} + \tfrac{1}{2} y_{12} \\
-y_{222} + y_{22} + \tfrac{1}{2} y_2 & -y_{1222} + y_{122} + \tfrac{1}{2} y_{12} & -y_{2222} + y_{222} + \tfrac{1}{2} y_{22}
\end{pmatrix} \succeq 0.
\end{aligned} \qquad (57)$$

Solving it, we obtain $p^2 = 1 - \sqrt{3} \approx -0.7321$. Again, it can be verified that the rank
condition (22) of Theorem 2 is satisfied, so that this solution is optimal, and the following
optimizer can be reconstructed:

$$x_1^\star = 1, \qquad x_2^\star = (1 - \sqrt{3})/2. \qquad (58)$$

As expected, the global minimum of (55) is higher than that of (46), as the commutative
case is more constrained than the non-commutative one.
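
The commutative optimum can be cross-checked by elementary means, independently of the SDP relaxation. The Python sketch below relies on two observations that are not spelled out in the text: the constraint $x_1^2 - x_1 = 0$ forces $x_1 \in \{0, 1\}$, and $-x_2^2 + x_2 + 1/2 \geq 0$ restricts $x_2$ to the interval between the roots $(1 \pm \sqrt{3})/2$. Since the objective is linear in $x_2$ for fixed $x_1$, only the interval endpoints need to be examined.

```python
import math

# Feasible set of (55): x1 in {0, 1}, x2 between the roots of -x2^2 + x2 + 1/2.
x2_low = (1 - math.sqrt(3)) / 2
x2_high = (1 + math.sqrt(3)) / 2

# The objective 2*x1*x2 is linear in x2 for fixed x1, so its minimum over the
# interval is attained at an endpoint.
candidates = [(x1, x2) for x1 in (0, 1) for x2 in (x2_low, x2_high)]
best = min(candidates, key=lambda point: 2 * point[0] * point[1])
p_star = 2 * best[0] * best[1]
```

The minimum is attained at $x_1 = 1$, $x_2 = (1 - \sqrt{3})/2$, with value $1 - \sqrt{3}$, in agreement with the second order relaxation.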



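Additional constraints. We now consider a problem of the form (39) by adding two
constraints to (46):

$$\begin{aligned}
p^* = \min_{(H,X,\phi)} \;& \langle \phi, (X_1X_2 + X_2X_1)\phi \rangle \\
\text{s.t.} \;& X_1^2 - X_1 = 0 \\
& -X_2^2 + X_2 + 1/2 \succeq 0 \\
& (3X_1 + 2X_2 - 1)\phi = 0 \\
& -\langle \phi, X_1\phi \rangle + 1/3 \geq 0.
\end{aligned} \qquad (59)$$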
Following (42), the corresponding first order relaxation is

$$\begin{aligned}
\min_{y} \;& y_{12} + y_{21} \\
\text{s.t.} \;& \begin{pmatrix} 1 & y_1 & y_2 \\ y_1 & y_1 & y_{12} \\ y_2 & y_{21} & y_{22} \end{pmatrix} \succeq 0 \\
& -y_{22} + y_2 + 1/2 \geq 0 \\
& 3y_w + 2y_v - y_u = 0, \quad (w, v, u) \in J_1 \\
& -y_1 + 1/3 \geq 0,
\end{aligned} \qquad (60)$$

where $J_1 = \{(1, 2, 0), (1, 12, 1), (21, 22, 2)\}$. This problem admits the solution $p^1 = -2/3$,
achieved for the moment matrix

$$M_1 = \begin{pmatrix} 1 & 1/3 & 0 \\ 1/3 & 1/3 & -1/3 \\ 0 & -1/3 & 1/2 \end{pmatrix} \qquad (61)$$

with eigenvalues 0, 2/3, and 7/6. The solution $p^1 = -2/3$ thus yields a lower bound on $p^*$,
which is already higher, as expected, than the optimal solution of (46). The second order
relaxation is



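$$\begin{aligned}
p^2 = \min_{y} \;& y_{12} + y_{21} \\
\text{s.t.} \;& \begin{pmatrix}
1 & y_1 & y_2 & y_{12} & y_{21} & y_{22} \\
y_1 & y_1 & y_{12} & y_{12} & y_{121} & y_{122} \\
y_2 & y_{21} & y_{22} & y_{212} & y_{221} & y_{222} \\
y_{21} & y_{21} & y_{212} & y_{212} & y_{2121} & y_{2122} \\
y_{12} & y_{121} & y_{122} & y_{1212} & y_{1221} & y_{1222} \\
y_{22} & y_{221} & y_{222} & y_{2212} & y_{2221} & y_{2222}
\end{pmatrix} \succeq 0 \\
& \begin{pmatrix}
-y_{22} + y_2 + \tfrac{1}{2} & -y_{221} + y_{21} + \tfrac{1}{2} y_1 & -y_{222} + y_{22} + \tfrac{1}{2} y_2 \\
-y_{221} + y_{21} + \tfrac{1}{2} y_1 & -y_{1221} + y_{121} + \tfrac{1}{2} y_1 & -y_{1222} + y_{122} + \tfrac{1}{2} y_{12} \\
-y_{222} + y_{22} + \tfrac{1}{2} y_2 & -y_{1222} + y_{122} + \tfrac{1}{2} y_{12} & -y_{2222} + y_{222} + \tfrac{1}{2} y_{22}
\end{pmatrix} \succeq 0 \\
& 3y_w + 2y_v - y_u = 0, \quad (w, v, u) \in J_2 \\
& -y_1 + 1/3 \geq 0,
\end{aligned} \qquad (62)$$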
where $J_2 = \{(1, 2, 0), (1, 12, 1), (21, 22, 2), (121, 122, 12), (21, 212, 21), (221, 222, 22),
(121, 1212, 121), (1221, 1222, 122), (2121, 2122, 212), (221, 2212, 221), (2221, 2222, 222)\}$. It
admits the solution $p^2 = -2/3$ with

$$M_2 = \begin{pmatrix}
1 & 1/3 & 0 & -1/3 & -1/3 & 1/2 \\
1/3 & 1/3 & -1/3 & -1/3 & 0 & -1/6 \\
0 & -1/3 & 1/2 & 1/3 & -1/6 & 1/2 \\
-1/3 & -1/3 & 1/3 & 1/3 & 0 & 1/6 \\
-1/3 & 0 & -1/6 & 0 & 1/6 & -1/3 \\
1/2 & -1/6 & 1/2 & 1/6 & -1/3 & 3/4
\end{pmatrix} \qquad (63)$$

which has two non-zero eigenvalues 17/12 and 5/3. As in the previous examples, it is easily
verified that the rank condition (22) is satisfied, and we thus deduce that $p^* = p^2 = -2/3$.
From the Gram decomposition $M_2 = R^T R$, with



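$$R = \begin{pmatrix}
1 & 1/3 & 0 & -1/3 & -1/3 & 1/2 \\
0 & \sqrt{2}/3 & -\sqrt{2}/2 & -\sqrt{2}/3 & \sqrt{2}/6 & -\sqrt{2}/2
\end{pmatrix}, \qquad (64)$$

one obtains the global optimizer

$$X_1^\star = \begin{pmatrix} 1/3 & \sqrt{2}/3 \\ \sqrt{2}/3 & 2/3 \end{pmatrix}, \quad
X_2^\star = \begin{pmatrix} 0 & -\sqrt{2}/2 \\ -\sqrt{2}/2 & 1 \end{pmatrix}, \quad
\phi^\star = \begin{pmatrix} 1 \\ 0 \end{pmatrix}. \qquad (65)$$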



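Finally, the dual of the first order relaxation (60) yields the SOS decomposition:

$$\begin{aligned}
x_1x_2 + x_2x_1 = {} & -\frac{2}{3} + \frac{1}{9} \left( -1 + 3x_1 + 2x_2 \right)^2 + \frac{4}{9} \left( -x_2^2 + x_2 + \frac{1}{2} \right) \\
& + \frac{1}{6}\, x_1 \left( 3x_1 + 2x_2 - 1 \right) + \frac{1}{6} \left( 3x_1 + 2x_2 - 1 \right) x_1 + \left( \frac{1}{3} - x_1 \right),
\end{aligned} \qquad (66)$$

which clearly implies $p^* \geq -2/3$.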
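
As a consistency check, the moments generated by the optimizer can be tested against the linear constraints of the second order relaxation. The Python sketch below assumes that (65) reads $X_1^\star = \begin{pmatrix} 1/3 & \sqrt{2}/3 \\ \sqrt{2}/3 & 2/3 \end{pmatrix}$, $X_2^\star = \begin{pmatrix} 0 & -\sqrt{2}/2 \\ -\sqrt{2}/2 & 1 \end{pmatrix}$, $\phi^\star = (1, 0)^T$, and that each triple $(w, v, u) \in J_2$ has the form $(u x_1, u x_2, u)$ after reduction with $x_1^2 = x_1$ (so that the fifth triple is taken to be $(21, 212, 21)$). Words are encoded as digit strings, with "0" for the empty word; the function `moment` is a hypothetical helper.

```python
import math

r2 = math.sqrt(2.0)
X = {1: [[1 / 3, r2 / 3], [r2 / 3, 2 / 3]],
     2: [[0.0, -r2 / 2], [-r2 / 2, 1.0]]}
phi = [1.0, 0.0]


def moment(word):
    """y_w = <phi, w(X) phi> for a word given as a digit string ('0' = empty)."""
    v = phi[:]
    for letter in reversed(word):       # the rightmost operator acts first
        if letter == "0":
            continue
        m = X[int(letter)]
        v = [m[0][0] * v[0] + m[0][1] * v[1],
             m[1][0] * v[0] + m[1][1] * v[1]]
    return phi[0] * v[0] + phi[1] * v[1]


J2 = [("1", "2", "0"), ("1", "12", "1"), ("21", "22", "2"),
      ("121", "122", "12"), ("21", "212", "21"), ("221", "222", "22"),
      ("121", "1212", "121"), ("1221", "1222", "122"),
      ("2121", "2122", "212"), ("221", "2212", "221"),
      ("2221", "2222", "222")]

# Residuals of the constraints 3 y_w + 2 y_v - y_u = 0.
residuals = [3 * moment(w) + 2 * moment(v) - moment(u) for w, v, u in J2]
objective = moment("12") + moment("21")
slack = -moment("1") + 1 / 3            # constraint -y_1 + 1/3 >= 0
```

All residuals vanish up to rounding, the objective equals $-2/3$, and the last inequality is saturated.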



5 Applications 

The results presented so far have immediate applications in quantum theory and quantum
information science. Since the dimension of the underlying Hilbert space is not specified in
the optimization problem P or (39), they are well adapted to situations where we want
to optimize a quantity over all its possible physical realizations, that is to say, over Hilbert
spaces of arbitrary dimension. Computing the maximal quantum violation of a Bell inequality
is an example of this sort.

Let $S_1, \ldots, S_N$ be a collection of finite disjoint sets. Each $S_k$ represents a measurement
that can be performed on a given system and the elements $i \in S_k$ are the possible outcomes of
the $k$th measurement. We suppose that the system is composed of two non-interacting subsystems,
and that measurements $S_1, \ldots, S_n$ are performed on the first system and measurements
$S_{n+1}, \ldots, S_N$ on the second. We put $A = S_1 \cup \ldots \cup S_n$, $B = S_{n+1} \cup \ldots \cup S_N$, and denote by
$P(ij)$ the joint probability to obtain outcome $i \in A$ and outcome $j \in B$ when measurements
associated to these outcomes are made on the first and second subsystems, respectively. In
quantum theory, these probabilities are given by $P(ij) = \langle \phi, E_iE_j\phi \rangle$, where $\phi$ describes the
state of the system under observation and the self-adjoint operators $E_i$ describe the measurements
performed on $\phi$. The measurement operators $\{E_i : i \in S_k\}$ associated to the
measurement $S_k$ form an orthogonal resolution of the identity, and operators corresponding
to different subsystems commute, i.e., $[E_i, E_j] = 0$ when $i \in A$ and $j \in B$.

For our purposes, a Bell inequality is simply a linear expression $\sum_{ij} c_{ij} P(ij)$ in the joint
probabilities. We are interested in the maximal value that this quantity can take over all
probabilities $P(ij)$ that admit a quantum representation. This amounts to solving the problem

$$\begin{aligned}
\max_{(H,E,\phi)} \;& \sum_{ij} c_{ij} \langle \phi, E_iE_j\phi \rangle \\
\text{s.t.} \;& E_iE_j = \delta_{ij}E_i \quad \forall S_k \text{ and } \forall i, j \in S_k \\
& \sum_{i \in S_k} E_i = 1 \quad \forall S_k \\
& [E_i, E_j] = 0 \quad \forall i \in A \text{ and } \forall j \in B,
\end{aligned} \qquad (67)$$

which is a particular instance of the non-commutative optimization problem (3) and involves
polynomials of degree at most 2. Note that $1 - \sum_{i \in S_k} E_i^2 = (1 - \sum_{i \in S_k} E_i) + \sum_{i \in S_k} (E_i - E_i^2) = 0$,
and thus the quadratic module associated to the constraints in (67) is Archimedean. The
sequence of SDP relaxations associated to (67) thus converges to the optimal solution. This
particular sequence of SDP relaxations is the one already introduced in [4, 5] and the source
of inspiration for the present work. It is the only tool currently available
to compute the maximal violation of a generic Bell inequality. It has been applied up to the
third order in [23] to derive upper bounds on the maximal violation of 241 Bell inequalities.
The resulting upper bounds are tight for all but 20 of these inequalities; for the remaining
20 inequalities the gap between the upper bound and the best known lower bound is small.

The sequence of SDP relaxations introduced here can also be used to decide if a given
set of probabilities $P(ij)$ admits a quantum representation [4, 5]. More generally, it has
the potential to find other applications in quantum information science.

Besides applications where the dimension of the underlying Hilbert space is not fixed,
the optimization problems P and (39) are also well suited to problems where the Hilbert
space is the unique irreducible representation space of a set of operators satisfying algebraic
constraints. Consider, for instance, a system of $N$ electrons that can occupy $M$ orbitals, each
orbital being associated with annihilation and creation operators $a_i$ and $a_i^\dagger$, $i = 1, \ldots, M$ (we
use the common physics notation $\dagger$ for the conjugate transpose). Since electrons interact
pairwise, the hamiltonian for such a system involves only two-body interactions and its ground
state energy can be computed as

$$\begin{aligned}
p^* = \min_{(H,a,\phi)} \;& \sum_{ijkl} h_{ijkl} \langle \phi, a_i^\dagger a_j^\dagger a_k a_l \phi \rangle \\
\text{s.t.} \;& \{a_i, a_j\} = 0 \\
& \{a_i^\dagger, a_j^\dagger\} = 0 \\
& \{a_i^\dagger, a_j\} = \delta_{ij} \\
& \Big( \sum_i a_i^\dagger a_i - N \Big) \phi = 0.
\end{aligned} \qquad (68)$$

The first three constraints represent the usual fermionic anticommutation relations, while
the last constraint fixes the number of electrons to $N$. This problem is a particular case of
(39) and it involves polynomials of degree 4. Note that the algebra of operators generated
by (68) has a unique irreducible representation of dimension $2^M$. Since a product (in normal
order) of more than $N$ of the operators $\{a_i^\dagger, a_j\}$ vanishes, the sequence of SDP relaxations
halts at order $N$, and thus $p^N = p^*$.

The hierarchy of SDP relaxations associated to the problem (68) can be used, for instance,
to compute the ground state electronic energy of atoms or molecules. In recent years, very
successful SDP methods based on the $N$-representability problem have been independently
introduced in quantum chemistry to compute these electronic energies [25, 26]. Our hierarchy
of SDP relaxations actually reduces to these existing SDP techniques. But our approach is
more general, and can be used to compute the ground-state energy of other many-body
systems, such as spin systems or systems described by unbounded operators satisfying the
canonical relations $[x, p] = i$ (in which case it has to be slightly adapted). These applications
will be presented in a forthcoming paper.

Finally, the method presented here might also prove useful for problems where the Hilbert
space dimension is fixed in advance. Consider for instance a polynomial optimization problem
of the form P where $\dim H = r$, i.e., where the operators $X$ are $r \times r$ matrices. We may
in principle solve such a problem by introducing an explicit parametrization of the matrices
$X$ and by using Lasserre's method for polynomial scalar optimization [1] or its extension
taking into account polynomial matrix inequalities [20]. This would require, however,
introducing of the order of $r^2$ scalar variables for each operator $X_i$, which renders this approach
impractical even for small problems. In comparison, the method presented here treats each
matrix as a single variable. Although it only represents a relaxation of the original problem
since the Hilbert space dimension is not fixed (in particular we have no guarantee that the
sequence of relaxations will converge to a solution with $\dim H = r$), it may nevertheless
provide a cheap way to compute lower bounds on the optimal solutions of these problems
when it is too costly to introduce an explicit parametrization.
Appendix A: Basics of semidefinite programming

Semidefinite programming [27] is a subfield of convex optimization concerned with the
following optimization problem, known as the primal problem:

$$\begin{aligned}
\text{minimize} \;& c^T x \\
\text{subject to} \;& F(x) = \sum_{i=1}^{m} x_i F_i - G \succeq 0.
\end{aligned} \qquad (69)$$

The problem variable is the vector $x$ with $m$ components $x_i$ and the problem parameters are
the $n \times n$ matrices $G$, $F_i$ and the scalars $c_i$. A vector $x$ is said to be primal feasible when
$F(x) \succeq 0$.

For each primal problem there is an associated dual problem, which is a maximization
problem of the form

$$\begin{aligned}
\text{maximize} \;& \operatorname{tr}(GZ) \\
\text{subject to} \;& \operatorname{tr}(F_i Z) = c_i, \quad i = 1, \ldots, m \\
& Z \succeq 0,
\end{aligned} \qquad (70)$$

where the optimization variable is the $n \times n$ matrix $Z$. The dual problem is also a semidefinite
program, i.e., it can be put in the same form as (69). A matrix $Z$ is said to be dual feasible
if it satisfies the conditions in (70).

The key property of the dual program is that it yields bounds on the optimal value of the
primal program. To see this, take a primal feasible point $x$ and a dual feasible point $Z$. Then
$c^T x - \operatorname{tr}(GZ) = \sum_{i=1}^{m} \operatorname{tr}(Z F_i) x_i - \operatorname{tr}(GZ) = \operatorname{tr}(Z F(x)) \geq 0$. This proves that the optimal
primal value $p^*$ and the optimal dual value $d^*$ satisfy $d^* \leq p^*$. In fact, it usually happens
that $d^* = p^*$. A sufficient condition for this to hold is that the dual (primal) problem admits
a strictly feasible point, that is, that there exists a matrix $Z \succ 0$ ($F(x) \succ 0$) that is dual
(primal) feasible [27]. We refer the reader to the review of Vandenberghe and Boyd [27] for
further information on SDP.
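
The weak duality argument can be followed on a toy instance. The Python sketch below uses hand-picked data that are not taken from the paper: a single variable ($m = 1$), $F_1$ the $2 \times 2$ identity, $G = \mathrm{diag}(1, 2)$ and $c_1 = 1$, so that $F(x) \succeq 0$ exactly when $x \geq 2$. It evaluates the duality gap for one primal feasible and one dual feasible point.

```python
def trace_product(a, b):
    """tr(a b) for two square matrices given as nested lists."""
    n = len(a)
    return sum(a[i][k] * b[k][i] for i in range(n) for k in range(n))


# Toy data (illustrative only): F(x) = x * F1 - G.
F1 = [[1.0, 0.0], [0.0, 1.0]]
G = [[1.0, 0.0], [0.0, 2.0]]
c1 = 1.0

x = 3.0                                   # primal feasible: F(x) = diag(2, 1)
Z = [[0.5, 0.0], [0.0, 0.5]]              # dual feasible: Z >= 0, tr(F1 Z) = c1

F_x = [[x * F1[i][j] - G[i][j] for j in range(2)] for i in range(2)]

dual_constraint = trace_product(F1, Z)    # must equal c1
gap = c1 * x - trace_product(G, Z)        # primal value minus dual value
gap_as_trace = trace_product(Z, F_x)      # tr(Z F(x)), nonnegative
```

Here the gap equals $3 - 1.5 = 1.5 = \operatorname{tr}(Z F(x))$; for this instance both optimal values are equal to 2, attained at $x = 2$ and $Z = \mathrm{diag}(0, 1)$.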

Many numerical packages are available to solve SDPs, for instance, for Matlab,
the toolboxes SeDuMi [22] and YALMIP [21]. These algorithms solve both the primal and
the dual at the same time and thus yield bounds on the accuracy of the solution that is
obtained.
Appendix B: Duals of the SDP relaxations

Here we show that the duals of the relaxations $R_k$ defined in (11) correspond to the problems
(27). To simplify the presentation, we do this explicitly only in the case where we are dealing
with polynomials defined in the real free $*$-algebra $\mathbb{R}[x, x^*]$ and where the SDP relaxations
(11) only involve real quantities. The more general case of complex SDP relaxations can be
treated similarly by decomposing them into real and imaginary parts.

Write $M_k(y) = \sum_w B_w y_w$ and $M_{k-d_i}(q_i y) = \sum_w C^i_w y_w$ for appropriate symmetric matrices
$B_w$ and $C^i_w$. The SDP relaxation (11) is then expressed as an SDP problem in primal
form (69) and its dual is

$$\begin{aligned}
\lambda^k = \max_{\lambda, V, W_i} \;& \lambda \\
\text{s.t.} \;& p_1 = \lambda + \operatorname{tr}(B_1 V) + \sum_{i=1}^{m} \operatorname{tr}(C^i_1 W_i) \\
& p_w = \operatorname{tr}(B_w V) + \sum_{i=1}^{m} \operatorname{tr}(C^i_w W_i) \quad (\forall\, 0 < |w| \leq 2k) \\
& V \succeq 0, \\
& W_i \succeq 0, \quad i = 1, \ldots, m,
\end{aligned} \qquad (71)$$

where $\lambda \in \mathbb{R}$, $V \in \mathbb{R}^{|W_k|} \times \mathbb{R}^{|W_k|}$, and $W_i \in \mathbb{R}^{|W_{k-d_i}|} \times \mathbb{R}^{|W_{k-d_i}|}$.

The terms on the left-hand sides of the above equality constraints are the coefficients in
the canonical basis of monomials $W_{2k} = \{w : |w| \leq 2k\}$ of the polynomial $p$. The quantities
$\operatorname{tr}(B_w V)$ on the right-hand side are the coefficients of a polynomial of the form $\sum_j b_j^* b_j$ where
each $b_j$ is a polynomial of degree $k$. Indeed, it is easily seen from the definition of the moment
matrix $M_k(y)$ that the entries of the matrices $B_w$ satisfy $B_w(u, v) = 1$ if $w = u^*v$ or $w = v^*u$
and $B_w(u, v) = 0$ otherwise. It follows that $\sum_{|w| \leq 2k} \operatorname{tr}(B_w V) w = \sum_{|u|, |v| \leq k} V(u, v)\, u^* v$, where we
used that $V$ is symmetric. As $V$ is positive semidefinite, we can write $V = \sum_j \mu_j a_j a_j^T$, where
$\mu_j \geq 0$ are the eigenvalues of $V$ and $a_j$ the corresponding eigenvectors. Using this expression
for $V$, we obtain that $\sum_{|w| \leq 2k} \operatorname{tr}(B_w V) w = \sum_j \mu_j a_j^* a_j$, which is of the announced form
with $b_j = \sqrt{\mu_j}\, a_j$. In a similar way, it can be shown that $\sum_{|w| \leq 2k} \operatorname{tr}(C^i_w W_i) w = \sum_j c_{ij}^* q_i c_{ij}$.
Putting everything together, we find that the problem (71) is equivalent to

$$\begin{aligned}
\lambda^k = \max \;& \lambda \\
\text{s.t.} \;& p - \lambda = \sum_j b_j^* b_j + \sum_{i=1}^{m} \sum_j c_{ij}^* q_i c_{ij} \\
& \max_j \deg(b_j) \leq k, \\
& \max_j \deg(c_{ij}) \leq k - d_i.
\end{aligned} \qquad (72)$$

In the case of polynomials defined on $\mathbb{C}[x, x^*]$, the dual of (11) has the same form as above,
but now all polynomials are allowed to take complex coefficients.

A similar analysis can be carried out to show that the problems (42) and (45) are dual to
each other.